Methods and systems for determining M. tuberculosis infection

ABSTRACT

Embodiments of the invention relate to methods and systems for the detection of Mycobacterium tuberculosis. Mycobacterium tuberculosis kills more than one million people each year. To better understand why M. tuberculosis is virulent and to discover chemical markers of this pathogen, we compared its lipid profile to that of the attenuated but related mycobacterium, Mycobacterium bovis Bacille Calmette Guerin (BCG). This strategy identified previously unknown compounds that are specific to M. tuberculosis, e.g. 1-tuberculosinyladenosine, N⁶-tuberculosinyladenosine, and various tuberculosinyladenosines having mycolic acids, produced by the Rv3378c enzyme.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit under 35 U.S.C. 119(e) of U.S. Provisional Application No. 61/840,125, filed on Jun. 27, 2013, the contents of which are incorporated by reference in their entirety.

GOVERNMENT SUPPORT

This invention was made with Government support under grant number RO1-A1049313 awarded by the National Institute of Allergy and Infectious Diseases (NIAID). The Government has certain rights in the invention.

FIELD OF INVENTION

Embodiments of the invention are directed to systems and methods for determining whether a subject is infected with Mycobacterium tuberculosis (TB).

BACKGROUND OF INVENTION

Mycobacterium tuberculosis (M. tuberculosis) remains one of the world's most important pathogens, with a mortality rate exceeding 1.5 million deaths annually (Dye C, et al. (2013) Annu Rev Public Health. 34:271-286). Despite study of this pathogen for more than a century, the spectrum of natural lipids within M. tuberculosis membranes is not yet fully defined. For example, the products of many genes annotated as lipid synthases remain unknown (Camus J C, et al. (2002) Re-annotation of the genome sequence of Mycobacterium tuberculosis H37Rv. Microbiology 148:2967-2973), and mass spectrometry detects hundreds of ions that do not correspond to known lipids in the MycoMass database (Layre E, et al. (2011) A comparative lipidomics platform for chemotaxonomic analysis of Mycobacterium tuberculosis, Chem Biol 18(12):1537-1549). This comparative lipidomics platform provides a method to detect individual lipids that are present in infectious bacteria that cause tuberculosis disease (M. tuberculosis) versus attenuated bacteria that are used in vaccines such as BCG. In general, it is important to distinguish patients with tuberculosis from those vaccinated with BCG because the treatments are different and more than 1 billion people have been vaccinated with BCG.

Methods of detecting the presence or absence of the bacteria typically include culturing a sample suspected of having bacteria. However, these tests may take over two weeks to complete depending on how long it takes to isolate and grow the bacteria. Accordingly, while such biochemical testing is relatively inexpensive, it is time consuming to grow and subculture bacteria in a sample to reach the minimal concentration of bacteria needed for testing. One standard for the diagnosis of active pulmonary tuberculosis is sputum smear microscopy for acid-fast bacilli. If a patient's sputum tests positive for M. tuberculosis they have active pulmonary tuberculosis, are considered highly infectious, and are placed on an exhaustive drug regimen for treatment. However, sputum smear microscopy has low sensitivity and it is estimated that sputum smear microscopy at best detects 25-60% of people with active pulmonary tuberculosis. The method also has relatively poor limits of detection as it requires the presence of at least 10,000 MTb bacilli/mL. An alternative to culture positivity is to detect bacterial DNA by PCR, but such methods are expensive and difficult to use in resource limited settings in which the tuberculosis epidemic is prevalent.

Serologic tests exist for M. tuberculosis diagnostics, but they continue to undergo development and tend to be more specific for exposure than active disease. Some commercialized tests use immunodominant antigens to detect immunoglobulin classes (like IgG) in an ELISA or dipstick format. Serological tests are estimated to detect one-third to three-quarters of sputum smear-positive cases of MTb. They detect a significantly smaller portion of smear-negative cases with HIV co-infection. In fact, for people infected with both HIV and MTb, serological tests detect less than one third of patients with the active form of the disease. Many molecular targets of current serological tests, such as mycolic acid and lipoarabinomannan, are produced by mycobacteria in the environment or vaccine strains. It is thought that vaccination or exposure to environmental mycobacteria causes false positive test results in patients with no M. tuberculosis infection. Identification of molecular targets that are expressed solely or mainly by M. tuberculosis and not other common mycobacteria is therefore expected to yield fewer false positive tests.

A widely used test to determine M. tuberculosis (TB) is the PPD (purified protein derivative) skin test. Patients are administered a small shot that contains PPD under the top layer of the skin. A bump or small welt will form, which usually goes away in a few hours. If the area of skin that received PPD is still reactive 48 to 72 hours after the injection, the test results are positive. People who received a BCG (bacille Calmette-Guerin) vaccine against tuberculosis give a false-positive reaction to the PPD test. Many foreign-born people have had the BCG vaccine, though it is not given in the U.S. due to its questionable effectiveness. Accordingly, even if one has been vaccinated, they could still carry the disease. The PPD test does not discriminate between BCG-vaccinated persons and patients with M. tuberculosis infection and tuberculosis disease. Thus, diagnosis of M. tuberculosis infection is complicated by the fact that approximately 1 billion people worldwide have been treated with live Mycobacterium bovis Bacille Calmette Guerin (BCG) bacteria as a vaccine, and those persons that have been treated with this vaccine will show a false positive reading in diagnostic tests. In addition, the PPD test is also known to show a positive reaction when a subject is infected with non-tuberculosis mycobacteria.

Accordingly, more efficient methods and systems are needed to screen patients suspected of having M. tuberculosis. In particular, identification of molecules that are produced by M. tuberculosis but not BCG provides the opportunity to develop molecular targets that will not cause false positive serological tests or biochemical tests that directly detect the molecule of interest in ELISA or related methods.

Approximately 1.7 billion people are infected with M. tuberculosis worldwide. A test that can distinguish people that have been treated with the common BCG vaccine, or that have non-tuberculous mycobacteria, from people that actually have the pathogenic M. tuberculosis is of great value.

SUMMARY OF INVENTION

Aspects of the present invention are based, in part, on the discovery of compounds, herein referred to as Formula I, Formula II, Formula III, and Formula IV, that are specifically expressed by pathogenic Mycobacterium tuberculosis (M. tuberculosis), i.e. they are not present in most mycobacteria, including highly related mycobacteria, avirulent (nonpathogenic) mycobacteria and environmental bacteria. Such specific targets are also absent in other non-mycobacterial pathogens that cause diseases that mimic the symptoms of tuberculosis. Significantly, detection of one or more compounds of Formula I-IV, or antibodies that recognize one or more compounds, does not result in false positive readings in subjects that have received the common BCG vaccine, i.e. a positive result correctly indicates that the subject is infected with M. tuberculosis. Accordingly, provided herein are methods and computer systems for determining whether a subject is infected with M. tuberculosis. Such methods provide a great improvement over the existing diagnostic technologies: 1) the test is specific for M. tuberculosis, and 2) the test can distinguish between a person that has been vaccinated for M. tuberculosis and is not infected from one who actually is infected with M. tuberculosis.

In one aspect, a method of identifying Mycobacterium tuberculosis in a subject is provided. The method comprises measuring the presence or absence of at least one compound selected from the group consisting of a compound of Formula I (1-tuberculosinyladenosine), Formula II (N⁶-tuberculosinyladenosine) and Formula III (a mycoloyl-tuberculosinyladenosine), in a biological sample that is derived from a subject suspected of having Mycobacterium tuberculosis infection, wherein the presence of the at least one compound of step a) is indicative that the subject has Mycobacterium tuberculosis infection. In one embodiment, the subject is tested in widespread screening of the population to detect tuberculosis. For example, since infection with TB is so prevalent, the entire population can be suspected of having TB infection.

In one embodiment, the presence of the at least two compounds of step a) is indicative that the subject has Mycobacterium tuberculosis infection.

In one embodiment, the presence of the at least three compounds of step a) is indicative that the subject has Mycobacterium tuberculosis infection.

In one embodiment, the method further comprises administering to the subject a treatment for Mycobacterium tuberculosis.

In another aspect, a method for treatment of Mycobacterium tuberculosis is provided, the method comprising: administering a pharmaceutically effective amount of a Mycobacterium tuberculosis therapeutic to a subject that has the presence of at least one compound selected from the group consisting of a compound of Formula I, Formula II and Formula III.

In one embodiment, the pharmaceutically effective amount of a Mycobacterium tuberculosis therapeutic is administered to a subject that has presence of at least two compounds selected from the group consisting of a compound of Formula I, Formula II and Formula III.

In one embodiment, the pharmaceutically effective amount of a Mycobacterium tuberculosis therapeutic is administered to a subject that has presence of a compound of Formula I, Formula II and of Formula III.

In another aspect, a method for determining if a subject is responsive to a Mycobacterium tuberculosis treatment is provided. The method comprises a) measuring the concentration of at least one compound selected from the group consisting of a compound of Formula I, Formula II and Formula III, in a first sample from a subject; b) administering to the subject a treatment for Mycobacterium tuberculosis; and c) measuring the concentration of the one or more compounds of step a) in a second sample from the subject, wherein a decrease in concentration of the compound as compared to the concentration in the first sample is indicative that the subject is responding to the treatment for Mycobacterium tuberculosis and reducing infection.
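The comparison in steps a) and c) can be illustrated with a minimal Python sketch. The function name, the use of a plain numeric concentration, and the absence of any measurement-error margin are assumptions made for illustration only; the specification does not prescribe units or a minimum size of decrease.

```python
def is_responding(first_conc: float, second_conc: float) -> bool:
    """Return True when the marker concentration in the second sample
    (taken after treatment) is lower than in the first sample.

    Hypothetical helper: concentrations are assumed to be in the same
    arbitrary units, and any decrease counts as a response.
    """
    if first_conc < 0 or second_conc < 0:
        raise ValueError("concentrations must be non-negative")
    return second_conc < first_conc
```

In practice an assay would likely require the decrease to exceed its own measurement variability before reporting a response; that threshold is left out here because the text does not specify one.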

In one embodiment of any of the above aspects, the compound is a variant of the compound of Formula III represented by Formula IV (i.e. mycoloyl-tuberculosinyladenosine as provided having R groups of C85 methoxy mycolate and C78 alpha mycolate, or other mycolyl variants described below).

In one embodiment of any of the above aspects, the subject suspected of having Mycobacterium tuberculosis infection has been diagnosed as having a bacterial infection.

In one embodiment of any of the above aspects, the subject is human.

In one embodiment of any of the above aspects, the biological sample derived from the subject is selected from the group consisting of: breath, sputum, blood, urine, gastric lavage and pleural fluid.

In one embodiment of any of the above aspects, the presence of the compound is measured using an assay selected from the group consisting of: mass spectrometry (MS), nuclear magnetic resonance spectroscopy and an immunoassay (e.g. high performance liquid chromatography mass spectrometry (HPLC-MS) or collision induced mass spectrometry (CID-MS), or an immunoassay to detect host antibodies against a compound of Formula I-IV).

In one embodiment of any of the above aspects, the assay is an immunoassay that detects the presence of the compound/s by monitoring the presence of host antibodies directed against the compound/s (e.g. ELISA).

In another aspect, a system for analyzing a biological sample is provided. The system comprises: a) a determination module configured to receive data from measuring a compound present in a biological sample of a subject suspected of having Mycobacterium tuberculosis infection (e.g. a subject that is part of a screening protocol), wherein the compound is selected from the group consisting of a compound of Formula I, Formula II and Formula III, and to optionally determine the concentration of the compound; b) a storage device configured to store information from the determination module; c) a comparison module adapted to compare the data stored on the storage device with reference data, and to provide a comparison result, wherein the comparison result identifies the presence or absence of at least one compound selected from the group consisting of a compound of Formula I, Formula II, and Formula III; and wherein the presence of the at least one compound is indicative that the subject has Mycobacterium tuberculosis infection; and d) a display module for displaying a content based in part on the comparison result for the user, wherein the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least one compound of step c), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of each of the compounds of Formula I, Formula II and Formula III.

In one embodiment, in step d) the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least two compounds of step c), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of at least two of the compounds of step c).

In one embodiment, in step d) the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least three single compounds of step c).

In one embodiment, the system has content that further comprises a signal indicating that the subject should be treated for Mycobacterium tuberculosis in the presence of at least one compound selected from the group consisting of Formula I, Formula II, and Formula III.

In one embodiment of the system, the compound of Formula III is represented by Formula IV.

In one embodiment of the system, the determination module is configured to receive data from a mass spectrometer.

In one embodiment of the system, the subject suspected of having Mycobacterium tuberculosis infection has been diagnosed as having a bacterial infection.

In one embodiment of the system, the subject is human.

In one embodiment of the system, the biological sample derived from the subject is selected from the group consisting of: breath, sputum, blood, urine, gastric lavage and pleural fluid.

In one embodiment of the system, the determination module receives data from a mass spectrometer, nuclear magnetic resonance spectroscopy, high performance liquid chromatography, or an immunoassay (e.g. data from an ELISA plate reader).
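As a non-limiting illustration, the comparison module and display module of the system described above might be sketched in Python as follows. All names (`MARKERS`, `ComparisonResult`, `compare`, `display`), the treatment of the reference data as a per-compound detection threshold, and the wording of the displayed signal are hypothetical choices for this sketch and are not taken from the specification.

```python
from dataclasses import dataclass

# Marker compounds named in the system aspect.
MARKERS = ("Formula I", "Formula II", "Formula III")


@dataclass(frozen=True)
class ComparisonResult:
    present: tuple  # markers whose measured signal exceeds the reference


def compare(measured: dict, reference: dict) -> ComparisonResult:
    """Comparison module: flag each marker whose stored measurement is
    above its reference value (assumed here to be a detection threshold).
    A marker missing from `measured` is treated as not detected."""
    present = tuple(
        m for m in MARKERS if measured.get(m, 0.0) > reference[m]
    )
    return ComparisonResult(present=present)


def display(result: ComparisonResult, min_compounds: int = 1) -> str:
    """Display module: produce the signal shown to the user.
    `min_compounds` mirrors the one-, two- and three-compound embodiments."""
    if len(result.present) >= min_compounds:
        return "Infection indicated: " + ", ".join(result.present)
    return "Infection not indicated"
```

For example, with a reference of 1.0 for every marker, a stored measurement of `{"Formula I": 5.0, "Formula III": 0.2}` yields a result in which only Formula I is present, which satisfies the one-compound embodiment but not the two-compound embodiment.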

In another aspect, a computer readable medium having computer readable instructions recorded thereon to define software modules including a comparison module and a display module for implementing a method on a computer is provided. The method implemented in this aspect comprises: a) comparing with the comparison module the data stored on a storage device with reference data to provide a comparison result, wherein the comparison result identifies the presence or absence of at least one compound selected from the group consisting of a compound of Formula I, Formula II, and Formula III; and wherein the presence of the at least one compound is indicative that the subject has Mycobacterium tuberculosis infection, and b) displaying a content based in part on the comparison result for the user, wherein the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least one compound of step a), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of each of the compounds of Formula I, Formula II and Formula III.

In one embodiment of the computer readable medium, in step b) the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least two compounds of step a), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of at least two of the compounds of step a).

In one embodiment of the computer readable medium, in step b) the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least three single compounds of step a).

In one embodiment of the computer readable medium, the compound of Formula III is represented by Formula IV.

In one embodiment of the computer readable medium, the content further comprises a signal indicating that the subject should be treated for Mycobacterium tuberculosis in the presence of at least one compound selected from the group consisting of Formula I, Formula II, Formula III and Formula IV.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A to 1C show graphs of the comparative lipidomic analysis of M. tuberculosis and BCG, which reveals a natural product constitutively produced and exported by M. tuberculosis. (FIG. 1A) Detected molecular features are shown as a scatterplot of intensity derived from M. tuberculosis H37Rv and M. bovis BCG lipid extracts. Each feature corresponds to a detected ion and contains retention time and m/z values, which are detailed in Database S1. 1,845 features out of 7,852 total features showed intensity ratios that deviate significantly from 1 (corrected p-value<0.05). The mass spectrum corresponds to the four M. tuberculosis-specific features of substance A. (FIG. 1B) Ion chromatograms extracted at m/z (540.3545) and retention time of substance A were used for the analysis of lipid extracts of reference strains. (FIG. 1C) Ion chromatograms from lipidomic analysis of filtered conditioned medium were extracted at the m/z of substance A or control compounds that are secreted (carboxymycobactin) and cell wall-associated lipids (trehalose monomycolate, mycobactin).

FIG. 2 shows the chemical structure of 1-tuberculosinyladenosine. Substance A purified from M. tuberculosis lipid extracts was characterized using CID-MS and NMR (800 MHz) analyses, yielding key collision products and resonances as indicated. These data establish that substance A is 1-tuberculosinyladenosine (1-TbAd).

FIGS. 3A to 3C show a schematic of the screen to identify M. tuberculosis biosynthesis of substance A and graphs indicating that it requires Rv3378c. (FIG. 3A) The screening of 4,196 transposon mutants of M. tuberculosis H37Rv using a rapid 3 minute HPLC-MS method yielded 30 strains with reduced 1-TbAd signal. (FIG. 3B) Rescreening with the regular 40 minute HPLC-MS method confirmed absence of 1-TbAd signal in two mutants. (FIG. 3C) Ion chromatograms. Both mutants were found to have spontaneous, non-transposon induced mutations in Rv3378c and were subject to complementation of Rv3377c-Rv3378c and reanalysis for 1-TbAd production.

FIGS. 4A to 4D show schematics of 1-TbAd synthesis (FIG. 4A and FIG. 4B), and ion chromatograms of synthesis indicating that Rv3378c acts as a tuberculosinyl transferase (FIGS. 4C and 4D). (FIG. 4A) Rv3377c and Rv3378c are currently thought to produce tuberculosinol and isotuberculosinol. (FIG. 4B) The existence of 1-TbAd might be explained by a revised function of the Rv3378c enzyme, which acts as a tuberculosinyl transferase. Ion chromatograms, mass spectra (insets) (FIG. 4C) and CID-MS (FIG. 4D) of the 1-TbAd standard and reaction products of enzymatic assays performed using recombinant Rv3378c protein (inset, chemical structure). These data prove that recombinant Rv3378c enzyme produces 1-TbAd.

FIG. 5 shows extracted ion chromatograms and mass spectra (insets) depicting that the expression of Rv3377c-Rv3378c is sufficient for production of 1-TbAd in M. smegmatis. Extracted ion chromatograms and mass spectra (insets) of 1-TbAd (m/z 540.3545) for the HPLC-MS analysis of lipid extracts from M. tuberculosis, M. smegmatis parental or Rv3377c-Rv3378c knock-in strains.

FIGS. 6A to 6E show schematics of the molecular structure of Rv3378c. Rv3378c adopts a (Z)-prenyl transferase fold. (FIG. 6A) Structure of Rv3378c dimer is compared to conventional (Z)-prenyl transferases. (FIG. 6B) Superposition of the active site of Rv3378c and other (Z)-prenyl transferases with the pyrophosphate bound to Rv2361c (stick) shows conserved key residues for substrate binding and catalysis (Rv3378c: blue, Rv2361c: yellow, Rv1086: gray, E. coli UPP synthase: magenta for carbon atoms). (FIG. 6C) The monomeric subunits of Rv3378c and Rv2361c were superimposed and Rv2361c substrates (sphere, carbon: yellow/gray, oxygen: red, phosphate: orange) are modeled in the active site of Rv3378c. The conserved residue, Asp34, is shown as a stick model and the magnesium ion is shown as a magenta sphere. (FIG. 6D) Proposed model of Rv3378c shows two substrate pockets with hydrophobic residues lining the predicted prenyl binding pocket and D34 positioned adjacent to the predicted adenosine binding pocket. (FIG. 6B-FIG. 6D) The flexible P-loop of Rv3378c (residues 80-95) is colored in red with a dotted line for the disordered region (residues 84-90). (FIG. 6E) The translucent surface of Rv3378c was modeled with substrates (spheres) using the same view as (FIG. 6D).

FIGS. 7A to 7C show the chemical structures of (FIG. 7A) Formula I (1-tuberculosinyladenosine (1-TbAd)); (FIG. 7B) Formula II (N⁶-tuberculosinyladenosine (N⁶-TbAd)); (FIG. 7C) Formula IV (mycoloyl-tuberculosinyladenosine (MTbAd)), that have been determined to be specific for M. tuberculosis.

FIGS. 8A to 8C show graphs indicating the detection of M. tuberculosis derived 1-TbAd during (FIG. 8A) exponential or stationary phase (FIG. 8B) in neutral and acid pH medium. Substance A constitutively accumulates independently of the ESX-1 apparatus. Overall, these data show that 1-TbAd is constitutively produced under a wide variety of conditions.

FIG. 9 shows ion chromatograms depicting that complementation of Rv1796 and Rv2867c failed to restore TbAd production. Ion chromatograms were extracted at m/z 540.3545 within 10 ppm mass accuracy, corresponding to 1-TbAd. In contrast to Rv3377c-Rv3378c, the complementation of tnRv1796 or tnRv2867c mutant strains by Rv1796 or Rv2867c, respectively, does not restore the production of TbAd.
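The 10 ppm mass accuracy window mentioned for FIG. 9 can be made concrete with a short Python sketch. Only the target m/z (540.3545) and the 10 ppm tolerance come from the text; the function names and the example observed values are hypothetical and chosen purely to show the arithmetic.

```python
TARGET_MZ = 540.3545  # m/z of 1-TbAd given in the figure descriptions


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed: float,
                     theoretical: float = TARGET_MZ,
                     tol_ppm: float = 10.0) -> bool:
    """True when the observed m/z falls inside the +/- tol_ppm window."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm
```

At m/z 540.3545, 10 ppm corresponds to roughly 0.0054 m/z units, so a hypothetical ion observed at 540.3590 (about 8.3 ppm high) would fall inside the extraction window, while one at 540.3610 (about 12 ppm high) would not.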

FIG. 10 shows collision peak data indicating that expression of Rv3377c-Rv3378c in M. smegmatis is sufficient for the biosynthesis of 1-TbAd. A collisional experiment was performed on the molecule detected, at the same m/z and retention time as M. tuberculosis 1-TbAd, in the lipid extract of M. smegmatis transformed by Rv3377c-Rv3378c, which also shows the characteristic fragmentation pattern of 1-TbAd. Thus, 1-TbAd is the product of the Rv3377c-Rv3378c locus.

FIG. 11 shows ion chromatograms depicting that aspartate 34 is required for the terpenyl transferase activity of Rv3378c in vitro. Ion chromatograms of the 1-TbAd in reaction products of enzymatic assays performed using wild type or aspartate 34 mutant Rv3378c protein.

FIG. 12 shows a block diagram showing an example of a system for determining a need for treatment of M. tuberculosis infection.

FIG. 13 shows a block diagram showing exemplary instructions on a computer readable medium for determining M. tuberculosis (TB) infection in an individual.

FIG. 14 shows spectrum data. Collisional mass spectrometry generates a low mass ion series of geranylgeraniol, tuberculosinol and substance A. The low-mass ion series of geranylgeraniol and tuberculosinol are compared with the MS3 spectrum of substance A from M. tuberculosis. Under nanoelectrospray conditions using methanol at 700 V, the diterpene alcohols yielded ions arising from loss of water from the protonated parent alcohol that are analogous to the m/z 273 ion found in the spectrum of 1-TbAd (substance A). All three samples produce similar CID spectra, but the relative peak intensities of fragment ions of 1-TbAd more closely match those of tuberculosinol than geranylgeraniol, particularly for ions corresponding to m/z 191.2, 189.2 and 163.2.

FIG. 15 shows a summary of NMR data, with assignments for natural 1-tuberculosinyladenosine from M. tuberculosis. Purified substance A was analyzed in CD₃OD at 800 MHz using a Bruker Avance 800, with this summary supported by the NMR spectra obtained (not shown).

FIG. 16 shows CID-MS spectra of substance A. The ion detected at m/z 136 (adenine) that arises from collision induced dissociation of either m/z 408 or m/z 268 indicates that both the C20H32 diterpene fragment, lost from m/z 408, and the C5H8O4 fragment, lost from m/z 268, are connected to adenine. The fragmentations leading to m/z 136, 268, and 408 involve hydrogen transfer to the adenine group. The m/z 136 ion arises through sequential losses of 272 Da and 132 Da. These spectra are consistent with a central adenine core structure separately connected to ribose and diterpene units.
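The neutral-loss arithmetic behind the FIG. 16 interpretation can be checked with a small Python sketch. It uses nominal (integer) atomic masses and the nominal parent m/z of 540 (from m/z 540.3545); the variable and function names are hypothetical, and the sketch only verifies that the two fragment compositions named in the text account for the observed ions.

```python
# Nominal (integer) atomic masses for the elements involved.
NOMINAL = {"C": 12, "H": 1, "O": 16}


def nominal_mass(composition: dict) -> int:
    """Nominal mass of a fragment given as element -> atom count."""
    return sum(NOMINAL[element] * count for element, count in composition.items())


DITERPENE_LOSS = {"C": 20, "H": 32}       # C20H32
SUGAR_LOSS = {"C": 5, "H": 8, "O": 4}     # C5H8O4

PARENT_MZ = 540  # nominal m/z of substance A

# Pathway 1: lose C5H8O4 first, then C20H32.
via_408 = PARENT_MZ - nominal_mass(SUGAR_LOSS)
end_1 = via_408 - nominal_mass(DITERPENE_LOSS)

# Pathway 2: lose C20H32 first, then C5H8O4.
via_268 = PARENT_MZ - nominal_mass(DITERPENE_LOSS)
end_2 = via_268 - nominal_mass(SUGAR_LOSS)
```

Both orders of loss pass through a different intermediate (m/z 408 or m/z 268) yet end at the same m/z 136 ion, which is the basis for concluding that the two units are attached separately to the adenine core.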

As used herein the term “Figure” is interchangeable with the term “Fig.”

DETAILED DESCRIPTION

Embodiments of the invention are based, in part, upon the discovery of compounds, i.e. Formula I (1-tuberculosinyladenosine (1-TbAd)); Formula II (N⁶-tuberculosinyladenosine (N⁶-TbAd)); Formula III (a tuberculosinyladenosine comprising mycolic acid) and Formula IV (mycoloyl-tuberculosinyladenosine (MTbAd)), that have been determined to be specific for M. tuberculosis. These compounds are directly useful for the diagnosis of M. tuberculosis infection, and thus are useful for determining a need for treatment of M. tuberculosis in subjects suspected of having M. tuberculosis (e.g. in healthy individuals, in subjects having a bacterial infection, or in subjects exhibiting a symptom of M. tuberculosis). Accordingly, provided herein are methods and computer systems for determining M. tuberculosis infection and treatment.

To identify lipids with roles in tuberculosis disease, we systematically compared the lipid content of virulent Mycobacterium tuberculosis with the attenuated vaccine strain M. bovis BCG. Comparative lipidomics analysis identified more than 1,000 molecular differences, including a previously unknown, M. tuberculosis-specific lipid that is composed of a diterpene unit linked to adenosine. We established the complete structure of the natural product as 1-tuberculosinyladenosine (1-TbAd) (also known as Formula I herein) using mass spectrometry, which was later supported by nuclear magnetic resonance (NMR) spectroscopy. We also identified N⁶-tuberculosinyladenosine (also known as Formula II herein); a tuberculosinyladenosine comprising mycolic acid (also known as Formula III herein); and mycoloyl-tuberculosinyladenosine (MTbAd) (also known as Formula IV herein).

As used herein the terms “Mycobacterium tuberculosis,” “TB,” “MTb,” “M. tuberculosis” and “pathogenic Mycobacterium tuberculosis” are used interchangeably. The term Mycobacterium tuberculosis refers to a pathogenic (e.g. virulent) bacterial species in the family Mycobacteriaceae and a causative agent of tuberculosis (TB) (See Ismael Kassim, Ray C G (editors) (2004) “Sherris Medical Microbiology” (4th ed.)). As used herein the term “pathogenic” refers to a bacterium that is capable of causing disease in a host. TB bacteria usually attack the lungs, but can attack any part of the body such as the kidney, spine, and brain. If not treated properly, TB disease can be fatal. It should be noted that the methods and the systems described herein are capable of detecting latent TB infection because antibody responses are durable during the latent stage, and at least small amounts of the compound are made when the bacterium is alive in the body. As used herein, “latent” TB infection refers to a patient that is infected with Mycobacterium tuberculosis, but the patient does not have active tuberculosis disease that is infectious.

One of skill in the art understands that there are multiple isolates of the same bacteria and that there are multiple strains (isolates) of M. tuberculosis. Each of the various isolates of M. tuberculosis can be detected using the systems and methods described herein. Some representative isolate strains of Mycobacterium tuberculosis include, but are not limited to, Mycobacterium tuberculosis EAI5/NITR206 Genebank Accession: NC_021194.1; Mycobacterium tuberculosis PanR0802 Genebank Accession: NZ_CM002050.1; Mycobacterium tuberculosis H37Rv Genebank Accession: NC_000962.3; Mycobacterium tuberculosis KZN 1435 Genebank Accession: NC_012943.1; Mycobacterium tuberculosis SUMu002 Genebank Accession: NZ_ADHR00000000.1; Mycobacterium tuberculosis CCDC5079 Genebank Accession: NC_017523.1; Mycobacterium tuberculosis PanR0208 Genebank Accession: NZ_CM002055.1; Mycobacterium tuberculosis KZN V2475 Genebank Accession: NZ_CM000788.2; Mycobacterium tuberculosis HN878 Genebank Accession: NZ_CM001043.1; Mycobacterium tuberculosis H37RvCO Genebank Accession: NZ_CM001515.1; Mycobacterium tuberculosis SUMu001 Genebank Accession: NZ_ADHQ00000000.1; Mycobacterium tuberculosis S96-129 Genebank Accession: NZ_AEGB00000000.1; Mycobacterium tuberculosis PanR1005 Genebank Accession: NZ_CM002051.1; Mycobacterium tuberculosis PanR0407 Genebank Accession: NZ_ATEB00000000; Mycobacterium tuberculosis PanR0315 Genebank Accession: NZ_ATEJ00000000.1; Mycobacterium tuberculosis NA-A0009 Genebank Accession: NZ_ALYH00000000; Mycobacterium tuberculosis H37Rv Genebank Accession: NC_000962.3; Mycobacterium tuberculosis str. Beijing/NITR203 Genebank Accession: NC_021054.1; Mycobacterium tuberculosis H37Ra Genebank Accession: NC_009525.1; Mycobacterium tuberculosis F11 Genebank Accession: NC_009565.1; Mycobacterium tuberculosis 7199-99 Genebank Accession: NC_020089.1; Mycobacterium tuberculosis str. Haarlem Genebank Accession: NC_022350.1; Mycobacterium tuberculosis CDC1551 Genebank Accession: NC_002755.2; and Mycobacterium tuberculosis str. Erdman=ATCC 35801 Genebank Accession: NC_020559.1.

In one embodiment the Mycobacterium tuberculosis isolate is selected from the group consisting of Mycobacterium tuberculosis H37Rv Genebank Accession: NC_000962.3; Mycobacterium tuberculosis str. Beijing/NITR203 Genebank Accession: NC_021054.1; Mycobacterium tuberculosis H37Ra Genebank Accession: NC_009525.1; Mycobacterium tuberculosis F11 Genebank Accession: NC_009565.1; Mycobacterium tuberculosis 7199-99 Genebank Accession: NC_020089.1; Mycobacterium tuberculosis str. Haarlem Genebank Accession: NC_022350.1; Mycobacterium tuberculosis CDC1551 Genebank Accession: NC_002755.2; and Mycobacterium tuberculosis str. Erdman=ATCC 35801 Genebank Accession: NC_020559.1. In one embodiment the Mycobacterium tuberculosis is Mycobacterium tuberculosis H37Rv.

In one embodiment the Mycobacterium tuberculosis isolate is selected from the group consisting of Mycobacterium tuberculosis EAI5/NITR206 Genebank Accession: NC_021194.1; Mycobacterium tuberculosis PanR0802 Genebank Accession: NZ_CM002050.1; Mycobacterium tuberculosis H37Rv Genebank Accession: NC_000962.3; Mycobacterium tuberculosis KZN 1435 Genebank Accession: NC_012943.1; Mycobacterium tuberculosis SUMu002 Genebank Accession: NZ_ADHR00000000.1; Mycobacterium tuberculosis CCDC5079 Genebank Accession: NC_017523.1; Mycobacterium tuberculosis PanR0208 Genebank Accession: NZ_CM002055.1; Mycobacterium tuberculosis KZN V2475 Genebank Accession: NZ_CM000788.2; Mycobacterium tuberculosis HN878 Genebank Accession: NZ_CM001043.1; Mycobacterium tuberculosis H37RvCO Genebank Accession: NZ_CM001515.1; Mycobacterium tuberculosis SUMu001 Genebank Accession: NZ_ADHQ00000000.1; Mycobacterium tuberculosis S96-129 Genebank Accession: NZ_AEGB00000000.1; Mycobacterium tuberculosis PanR1005 Genebank Accession: NZ_CM002051.1; Mycobacterium tuberculosis PanR0407 Genebank Accession: NZ_ATEB00000000; Mycobacterium tuberculosis PanR0315 Genebank Accession: NZ_ATEJ00000000.1; and Mycobacterium tuberculosis NA-A0009 Genebank Accession: NZ_ALYH00000000.

Methods and systems of the invention are particularly useful for screening all members of the population for TB, i.e. including healthy individuals. TB infection is so prevalent that the whole population is suspected of having TB infection, e.g. latent infection showing no symptoms of the disease. The World Health Organization (WHO) estimates that between 1.5 and 2 billion people worldwide have latent TB, and, e.g., upon entry into most school systems, screening for Mycobacterium tuberculosis infection is a required test. The tests described herein can replace the standard PPD test that is currently used to mass screen for TB infection. The PPD test is a tuberculosis skin test used to determine if someone has developed an immune response to Mycobacterium tuberculosis; this response can occur if someone currently has TB, if they were exposed to it in the past, or if they received the BCG vaccine against TB (which is not commonly administered in the U.S.). The PPD test is commonly used to screen healthy adults and children, as billions of people worldwide have latent TB and show no signs of the infection, and around 2 to 3 million people worldwide die of TB each year. However, the PPD test has a disadvantage in that it will falsely identify an uninfected subject as positive if the subject has received a BCG vaccine. The methods, assays, and systems described herein will not falsely identify those that have received a BCG vaccine as being positive for TB, as such patients, unless they are truly infected with Mycobacterium tuberculosis, will not show the presence of the compounds of Formula I-IV, which are specific for Mycobacterium tuberculosis. In addition, a subject that has been administered the BCG vaccine will not show a positive immune reaction against the compounds of Formula I-IV, which are specific for Mycobacterium tuberculosis. In a related idea, serological tests for M. tuberculosis infection have not been widely implemented, and one key reason for this is that in endemic areas environmental bacteria are common, and exposure to environmental bacteria causes false positive tests for patient antibodies. Because the compounds (I-IV) are produced only by M. tuberculosis and the Rv3378c gene, which is required for their production, is absent in all known strains of environmental bacteria, serological tests based on compounds I-IV should not be hindered by this known mechanism of false positivity based on endemic environmental mycobacteria.

In certain embodiments, the subject to be tested for Mycobacterium tuberculosis (TB) infection is first selected as having one or more symptoms of TB infection. Symptoms of TB disease depend on where in the body the TB bacteria are growing. TB disease symptoms include, but are not limited to, a persistent cough that lasts 3 weeks or longer, pain in the chest, coughing up blood or sputum (phlegm from deep inside the lungs), weakness or fatigue, weight loss, no appetite, chills, fever, and sweating at night. One of skill in the art is well versed in assessing such symptoms.

In certain embodiments, the subject to be tested for Mycobacterium tuberculosis infection has previously been diagnosed as having a bacterial infection. Methods for diagnosing bacterial infection are well known to those of skill in the art and include, for example, complete blood count and cultures of fluid suspected of bacterial infection. This may include e.g., a blood culture, a urine culture, a spinal culture (which requires a spinal tap), or sputum culture. Another common method for determining bacterial infection is the Gram stain, which is a rapid, inexpensive method for demonstrating the presence of bacteria and fungi, as well as inflammatory cells using microscopy. These methods are further described in the following textbook: Kliegman: Nelson Textbook of Pediatrics, 19th ed. (2011) Saunders, an Imprint of Elsevier, Philadelphia U.S.A.; See Chapter 164: Diagnostic Microbiology, by Anita K. M. Zaidi and Donald A. Goldmann.

Methods, computer systems, media, and assays are provided herein for determining infection with M. tuberculosis. In embodiments of the invention, determination of infection with M. tuberculosis comprises determining the presence or absence of one or more compounds of Formula I-IV in a biological sample that has been taken from a subject. The presence of one or more compounds is indicative that the individual is infected with TB.

As used herein Formula I refers to 1-tuberculosinyladenosine (1-TbAd) having the following chemical structure:

As used herein Formula II refers to N⁶-tuberculosinyladenosine (N⁶-TbAd) having the following chemical structure:

As used herein Formula III refers to a mycoloyl-tuberculosinyladenosine (Mycoloyl TbAd) having the following chemical structure:

wherein:

-   R¹ is H or

-   R² is absent or

provided that one of R² and R³ is

-   R³ and R⁴ are selected independently from hydrogen, mycolic acids,    and any combinations thereof, provided that at least one of R³ and    R⁴ is α mycolic acid.

In embodiments of compounds of Formula III, R¹ can be

and R² can be absent. In some other embodiments, R¹ can be H and R² can be

It is noted that when R² is

the nitrogen it is attached to carries a positive charge.

In compounds of Formula III, only one or both of R³ and R⁴ can be α mycolic acid. When R³ and R⁴ both are mycolic acids, they can be the same or different. In addition, they can be the same type of mycolic acid. In some embodiments, one of R³ and R⁴ is hydrogen and the other is α mycolic acid. In one embodiment, R³ is hydrogen and R⁴ is α mycolic acid. In another embodiment, R³ is α mycolic acid and R⁴ is hydrogen.

Mycolic acids are very long chain (up to C95) α-branched and β-hydroxylated fatty acids (Laval et al. Anal Chem, 2001, 73: 4537-4544, content of which is incorporated herein by reference in its entirety). Mycolic acids can be described as a β-hydroxy acid substituted at the α-position with a moderately long aliphatic chain. Generally, mycolic acids are composed of a longer beta hydroxy chain with a shorter alpha-alkyl side chain. Mostly, mycolic acids contain between 30 and 90 carbon atoms. The exact number of carbons varies by species and can be used as an identification aid. Most mycolic acids also contain various functional groups. In some embodiments, mycolic acid is α mycolic acid produced by Mycobacterium tuberculosis. Exemplary mycolic acids include, but are not limited to, α-mycolic acids, α′-mycolic acids, methoxymycolic acids, ketomycolic acids, and epoxymycolic acids.

Generally, α-mycolic acids are of structure: CH₃—(CH₂)_(n)-A-(CH₂)_(m)—B—(CH₂)_(p)—CH(OH)—CH(CO₂H)—(CH₂)_(q)—CH₃; α′-mycolic acids of structure: CH₃—(CH₂)_(n)-A-CH═CH—(CH₂)_(p)—CH(OH)—CH(CO₂H)—(CH₂)_(q)—CH₃; methoxymycolic acids of structure: CH₃—(CH₂)_(n)—CH(CH₃)—CH(OCH₃)—(CH₂)_(m)—B—(CH₂)_(p)—CH(OH)—CH(CO₂H)—(CH₂)_(q)—CH₃; ketomycolic acids of structure: CH₃—(CH₂)_(n)—CH(CH₃)—C(O)—(CH₂)_(m)—B—(CH₂)_(p)—CH(OH)—CH(CO₂H)—(CH₂)_(q)—CH₃; epoxymycolic acids of structure: CH₃—(CH₂)_(n)—CH(CH₃)—X—(CH₂)_(m)—B—(CH₂)_(p)—CH(OH)—CH(CO₂H)—(CH₂)_(q)—CH₃, wherein X is

ω-carboxymycolic acids of structure: HO—C(O)—(CH₂)_(m)—B—(CH₂)_(p)—CH(OH)—CH(CO₂H)—(CH₂)_(q)—CH₃; and ω1-carboxymycolic acids of structure: CH₃—CH(OCH₃)—(CH₂)_(n)-A-(CH₂)_(m)—B—(CH₂)_(p)—CH(OH)—CH(CO₂H)—(CH₂)_(q)—CH₃, wherein A is CH═CH (cis or trans), CH(CH₃)—CH═CH (cis or trans), or

(cis or trans); B is CH═CH (cis or trans), CH═CH—CH(CH₃) (cis or trans),

(cis or trans), or

(cis or trans); n, m, p and q are independently 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25.

Preferably, A is CH═CH (cis), CH(CH₃)—CH═CH (trans), or

Preferred B are CH═CH (cis), CH═CH—CH(CH₃) (trans),

(trans).

In the above described structures of mycolic acids, n and p can be independently 13, 15, 17, 19, or 21. When no methyl branch is present in both A and B (i.e. A is CH═CH or

and B is CH═CH or

or a methyl branch is present in both A and B (i.e., A is CH(CH₃)—CH═CH and B is CH═CH—CH(CH₃) or

m can be 10, 12, 14, 16, or 18. When a methyl branch is present in either A or B, m can be 11, 13, 15, 17, or 19. Generally, q is 17, 19, 21, or 25.

In some embodiments, R³ and R⁴ are mycolic acids selected independently from the group consisting of α′ mycolic acids (56 carbons), α′ mycolic acids (58 carbons), α′ mycolic acids (60 carbons), α′ mycolic acids (62 carbons), α′ mycolic acids (64 carbons), α′ mycolic acids (66 carbons), α′ mycolic acids (68 carbons), α mycolic acids (69 carbons), α mycolic acids (70 carbons), α mycolic acids (71 carbons), α mycolic acids (72 carbons), ω1-methoxy mycolic acids (71 carbons), α mycolic acids (73 carbons), keto mycolic acids (72 carbons), ω1-methoxy mycolic acids (72 carbons), ω1-methoxy mycolic acids (72 carbons), α mycolic acids (74 carbons), keto mycolic acids (73 carbons), ω1-methoxy mycolic acids (73 carbons), methoxy mycolic acids (73 carbons), ω1-methoxy mycolic acids (73 carbons), ω1-methoxy mycolic acids (73 carbons), α mycolic acids (75 carbons), keto mycolic acids (74 carbons), ω1-methoxy mycolic acids (74 carbons), methoxy mycolic acids (74 carbons), ω1-methoxy mycolic acids (74 carbons), α mycolic acids (76 carbons), keto mycolic acids (75 carbons), ω1-methoxy mycolic acids (75 carbons), methoxy mycolic acids (75 carbons), ω1-methoxy mycolic acids (75 carbons), ω1-methoxy mycolic acids (75 carbons), α mycolic acids (77 carbons), keto mycolic acids (76 carbons), ω1-methoxy mycolic acids (76 carbons), methoxy mycolic acids (76 carbons), ω1-methoxy mycolic acids (76 carbons), α mycolic acids (78 carbons), keto mycolic acids (77 carbons), ω1-methoxy mycolic acids (77 carbons), methoxy mycolic acids (77 carbons), ω1-methoxy mycolic acids (77 carbons), ω1-methoxy mycolic acids (77 carbons), α mycolic acids (79 carbons), keto mycolic acids (78 carbons), ω1-methoxy mycolic acids (78 carbons), methoxy mycolic acids (78 carbons), ω1-methoxy mycolic acids (78 carbons), α mycolic acids (80 carbons), keto mycolic acids (79 carbons), ω1-methoxy mycolic acids (79 carbons), methoxy mycolic acids (79 carbons), ω1-methoxy mycolic acids (79 carbons), α mycolic acids (81 carbons), keto mycolic acids (80 carbons), ω1-methoxy mycolic acids (80 carbons), methoxy mycolic acids (80 carbons), ω1-methoxy mycolic acids (81 carbons), α mycolic acids (82 carbons), keto mycolic acids (81 carbons), ω1-methoxy mycolic acids (81 carbons), methoxy mycolic acids (81 carbons), ω1-methoxy mycolic acids (82 carbons), α mycolic acids (83 carbons), keto mycolic acids (82 carbons), ω1-methoxy mycolic acids (82 carbons), methoxy mycolic acids (82 carbons), ω1-methoxy mycolic acids (83 carbons), ω1-methoxy mycolic acids (83 carbons), α mycolic acids (84 carbons), keto mycolic acids (83 carbons), ω1-methoxy mycolic acids (83 carbons), methoxy mycolic acids (83 carbons), ω1-methoxy mycolic acids (84 carbons), α mycolic acids (85 carbons), keto mycolic acids (84 carbons), ω1-methoxy mycolic acids (84 carbons), methoxy mycolic acids (84 carbons), ω1-methoxy mycolic acids (85 carbons), ω1-methoxy mycolic acids (85 carbons), α mycolic acids (86 carbons), keto mycolic acids (85 carbons), ω1-methoxy mycolic acids (85 carbons), methoxy mycolic acids (85 carbons), ω1-methoxy mycolic acids (86 carbons), α mycolic acids (87 carbons), keto mycolic acids (86 carbons), ω1-methoxy mycolic acids (86 carbons), methoxy mycolic acids (86 carbons), ω1-methoxy mycolic acids (87 carbons), ω1-methoxy mycolic acids (87 carbons), α mycolic acids (88 carbons), keto mycolic acids (87 carbons), ω1-methoxy mycolic acids (87 carbons), methoxy mycolic acids (87 carbons), ω1-methoxy mycolic acids (88 carbons), α mycolic acids (89 carbons), keto mycolic acids (88 carbons), ω1-methoxy mycolic acids (88 carbons), methoxy mycolic acids (88 carbons), ω1-methoxy mycolic acids (89 carbons), α mycolic acids (90 carbons), keto mycolic acids (89 carbons), ω1-methoxy mycolic acids (89 carbons), methoxy mycolic acids (89 carbons), α mycolic acids (91 carbons), keto mycolic acids (90 carbons), ω1-methoxy mycolic acids (90 carbons), methoxy mycolic acids (90 carbons), ω1-methoxy mycolic acids (91 carbons), methoxy mycolic acids (91 carbons), ω1-methoxy mycolic acids (92 carbons), and ω1-methoxy mycolic acids (93 carbons).

We have identified distinct variants of mycoloyl-tuberculosinyladenosine that are made by Mycobacterium tuberculosis, each of which is useful as a marker of infection with the bacteria, e.g. the distinct variants are mycolates of the alpha, methoxy, or keto forms.

Formula IV is merely a representative structure of one of the mycoloyl-tuberculosinyladenosines that are useful in the methods of the invention.

As used herein, Formula IV refers to a mycoloyl-tuberculosinyladenosine of Formula III having the following chemical structure:

We have identified molecular variants of mycoloyl-tuberculosinyladenosine that are made by Mycobacterium tuberculosis, each of which is useful as a marker of infection with the bacteria. See, e.g., the masses in Table 1. Mycolic acids produced by TB are described in e.g. C. Barry et al., (1998) Mycolic acids: structure, biosynthesis and physiological functions, Progress in Lipid Research 37: 143, which is herein incorporated by reference in its entirety.

TABLE 1: Mycoloylated TbAd (formula given as the number of C, H, O, and N atoms)

| Alkyl length | C | H | O | N | MW | M + H | M + Na |
|---|---|---|---|---|---|---|---|
| 53 | 83 | 145 | 8 | 5 | 1340.1093 | 1341.1166 | 1363.0385 |
| 56 | 86 | 153 | 6 | 5 | 1352.1821 | 1353.1893 | 1375.1113 |
| 54 | 84 | 147 | 8 | 5 | 1354.1249 | 1355.1322 | 1377.0542 |
| 55 | 85 | 149 | 8 | 5 | 1368.1406 | 1369.1479 | 1391.0698 |
| 58 | 88 | 157 | 6 | 5 | 1380.2134 | 1381.2206 | 1403.1426 |
| 56 | 86 | 151 | 8 | 5 | 1382.1562 | 1383.1635 | 1405.0855 |
| 57 | 87 | 153 | 8 | 5 | 1396.1719 | 1397.1792 | 1419.1011 |
| 60 | 90 | 161 | 6 | 5 | 1408.2447 | 1409.2519 | 1431.1739 |
| 58 | 88 | 155 | 8 | 5 | 1410.1875 | 1411.1948 | 1433.1168 |
| 59 | 89 | 157 | 8 | 5 | 1424.2032 | 1425.2105 | 1447.1324 |
| 62 | 92 | 165 | 6 | 5 | 1436.2760 | 1437.2832 | 1459.2052 |
| 60 | 90 | 159 | 8 | 5 | 1438.2188 | 1439.2261 | 1461.1481 |
| 61 | 91 | 161 | 8 | 5 | 1452.2345 | 1453.2418 | 1475.1637 |
| 64 | 94 | 169 | 6 | 5 | 1464.3073 | 1465.3145 | 1487.2365 |
| 62 | 92 | 163 | 8 | 5 | 1466.2501 | 1467.2574 | 1489.1794 |
| 63 | 93 | 165 | 8 | 5 | 1480.2658 | 1481.2731 | 1503.1950 |
| 66 | 96 | 173 | 6 | 5 | 1492.3386 | 1493.3458 | 1515.2678 |
| 64 | 94 | 167 | 8 | 5 | 1494.2814 | 1495.2887 | 1517.2107 |
| 65 | 95 | 169 | 8 | 5 | 1508.2971 | 1509.3044 | 1531.2263 |
| 68 | 98 | 177 | 6 | 5 | 1520.3699 | 1521.3771 | 1543.2991 |
| 66 | 96 | 171 | 8 | 5 | 1522.3127 | 1523.3200 | 1545.2420 |
| 69 | 99 | 177 | 6 | 5 | 1532.3699 | 1533.3771 | 1555.2991 |
| 67 | 97 | 173 | 8 | 5 | 1536.3284 | 1537.3357 | 1559.2576 |
| 70 | 100 | 179 | 6 | 5 | 1546.3855 | 1547.3928 | 1569.3147 |
| 68 | 98 | 175 | 8 | 5 | 1550.3440 | 1551.3513 | 1573.2733 |
| 71 | 101 | 181 | 6 | 5 | 1560.4012 | 1561.4084 | 1583.3304 |
| 69 | 99 | 177 | 8 | 5 | 1564.3597 | 1565.3670 | 1587.2889 |
| 72 | 102 | 183 | 6 | 5 | 1574.4168 | 1575.4241 | 1597.3460 |
| 71 | 101 | 189 | 7 | 5 | 1584.4587 | 1585.4660 | 1607.3879 |
| 73 | 103 | 185 | 6 | 5 | 1588.4325 | 1589.4397 | 1611.3617 |
| 72 | 102 | 183 | 7 | 5 | 1590.4117 | 1591.4190 | 1613.3410 |
| 72 | 102 | 183 | 7 | 5 | 1590.4117 | 1591.4190 | 1613.3410 |
| 72 | 102 | 191 | 7 | 5 | 1598.4743 | 1599.4816 | 1621.4036 |
| 74 | 104 | 187 | 6 | 5 | 1602.4481 | 1603.4554 | 1625.3773 |
| 73 | 103 | 185 | 7 | 5 | 1604.4274 | 1605.4347 | 1627.3566 |
| 73 | 103 | 185 | 7 | 5 | 1604.4274 | 1605.4347 | 1627.3566 |
| 73 | 103 | 185 | 7 | 5 | 1604.4274 | 1605.4347 | 1627.3566 |
| 73 | 103 | 187 | 7 | 5 | 1606.4430 | 1607.4503 | 1629.3723 |
| 73 | 103 | 191 | 7 | 5 | 1610.4743 | 1611.4816 | 1633.4036 |
| 73 | 103 | 193 | 7 | 5 | 1612.4900 | 1613.4973 | 1635.4192 |
| 75 | 105 | 189 | 6 | 5 | 1616.4638 | 1617.4710 | 1639.3930 |
| 74 | 104 | 187 | 7 | 5 | 1618.4430 | 1619.4503 | 1641.3723 |
| 74 | 104 | 187 | 7 | 5 | 1618.4430 | 1619.4503 | 1641.3723 |
| 74 | 104 | 187 | 7 | 5 | 1618.4430 | 1619.4503 | 1641.3723 |
| 74 | 104 | 189 | 7 | 5 | 1620.4587 | 1621.4660 | 1643.3879 |
| 74 | 104 | 193 | 7 | 5 | 1624.4900 | 1625.4973 | 1647.4192 |
| 76 | 106 | 191 | 6 | 5 | 1630.4794 | 1631.4867 | 1653.4086 |
| 75 | 105 | 189 | 7 | 5 | 1632.4587 | 1633.4660 | 1655.3879 |
| 75 | 105 | 189 | 7 | 5 | 1632.4587 | 1633.4660 | 1655.3879 |
| 75 | 105 | 189 | 7 | 5 | 1632.4587 | 1633.4660 | 1655.3879 |
| 75 | 105 | 191 | 7 | 5 | 1634.4743 | 1635.4816 | 1657.4036 |
| 75 | 105 | 193 | 7 | 5 | 1636.4900 | 1637.4973 | 1659.4192 |
| 75 | 105 | 195 | 7 | 5 | 1638.5056 | 1639.5129 | 1661.4349 |
| 77 | 107 | 193 | 6 | 5 | 1644.4951 | 1645.5023 | 1667.4243 |
| 76 | 106 | 191 | 7 | 5 | 1646.4743 | 1647.4816 | 1669.4036 |
| 76 | 106 | 191 | 7 | 5 | 1646.4743 | 1647.4816 | 1669.4036 |
| 76 | 106 | 191 | 7 | 5 | 1646.4743 | 1647.4816 | 1669.4036 |
| 76 | 106 | 193 | 7 | 5 | 1648.4900 | 1649.4973 | 1671.4192 |
| 76 | 106 | 195 | 7 | 5 | 1650.5056 | 1651.5129 | 1673.4349 |
| 78 | 108 | 195 | 6 | 5 | 1658.5107 | 1659.5180 | 1681.4399 |
| 77 | 107 | 193 | 7 | 5 | 1660.4900 | 1661.4973 | 1683.4192 |
| 77 | 107 | 193 | 7 | 5 | 1660.4900 | 1661.4973 | 1683.4192 |
| 77 | 107 | 193 | 7 | 5 | 1660.4900 | 1661.4973 | 1683.4192 |
| 77 | 107 | 195 | 7 | 5 | 1662.5056 | 1663.5129 | 1685.4349 |
| 77 | 107 | 195 | 7 | 5 | 1662.5056 | 1663.5129 | 1685.4349 |
| 77 | 107 | 197 | 7 | 5 | 1664.5213 | 1665.5286 | 1687.4505 |
| 79 | 109 | 197 | 6 | 5 | 1672.5264 | 1673.5336 | 1695.4556 |
| 78 | 108 | 195 | 7 | 5 | 1674.5056 | 1675.5129 | 1697.4349 |
| 78 | 108 | 195 | 7 | 5 | 1674.5056 | 1675.5129 | 1697.4349 |
| 78 | 108 | 195 | 7 | 5 | 1674.5056 | 1675.5129 | 1697.4349 |
| 78 | 108 | 197 | 7 | 5 | 1676.5213 | 1677.5286 | 1699.4505 |
| 78 | 108 | 197 | 7 | 5 | 1676.5213 | 1677.5286 | 1699.4505 |
| 80 | 110 | 199 | 6 | 5 | 1686.5420 | 1687.5493 | 1709.4712 |
| 79 | 109 | 197 | 7 | 5 | 1688.5213 | 1689.5286 | 1711.4505 |
| 79 | 109 | 197 | 7 | 5 | 1688.5213 | 1689.5286 | 1711.4505 |
| 79 | 109 | 197 | 7 | 5 | 1688.5213 | 1689.5286 | 1711.4505 |
| 79 | 109 | 199 | 7 | 5 | 1690.5369 | 1691.5442 | 1713.4662 |
| 79 | 109 | 199 | 7 | 5 | 1690.5369 | 1691.5442 | 1713.4662 |
| 81 | 111 | 201 | 6 | 5 | 1700.5577 | 1701.5649 | 1723.4869 |
| 80 | 110 | 199 | 7 | 5 | 1702.5369 | 1703.5442 | 1725.4662 |
| 80 | 110 | 199 | 7 | 5 | 1702.5369 | 1703.5442 | 1725.4662 |
| 80 | 110 | 199 | 7 | 5 | 1702.5369 | 1703.5442 | 1725.4662 |
| 80 | 110 | 201 | 7 | 5 | 1704.5526 | 1705.5599 | 1727.4818 |
| 81 | 111 | 199 | 7 | 5 | 1714.5369 | 1715.5442 | 1737.4662 |
| 82 | 112 | 203 | 6 | 5 | 1714.5733 | 1715.5806 | 1737.5025 |
| 81 | 111 | 201 | 7 | 5 | 1716.5526 | 1717.5599 | 1739.4818 |
| 81 | 111 | 201 | 7 | 5 | 1716.5526 | 1717.5599 | 1739.4818 |
| 81 | 111 | 201 | 7 | 5 | 1716.5526 | 1717.5599 | 1739.4818 |
| 81 | 111 | 203 | 7 | 5 | 1718.5682 | 1719.5755 | 1741.4975 |
| 82 | 112 | 201 | 7 | 5 | 1728.5526 | 1729.5599 | 1751.4818 |
| 83 | 113 | 205 | 6 | 5 | 1728.5890 | 1729.5962 | 1751.5182 |
| 82 | 112 | 203 | 7 | 5 | 1730.5682 | 1731.5755 | 1753.4975 |
| 82 | 112 | 203 | 7 | 5 | 1730.5682 | 1731.5755 | 1753.4975 |
| 82 | 112 | 203 | 7 | 5 | 1730.5682 | 1731.5755 | 1753.4975 |
| 82 | 112 | 205 | 7 | 5 | 1732.5839 | 1733.5912 | 1755.5131 |
| 83 | 113 | 201 | 7 | 5 | 1740.5526 | 1741.5599 | 1763.4818 |
| 83 | 113 | 203 | 7 | 5 | 1742.5682 | 1743.5755 | 1765.4975 |
| 84 | 114 | 207 | 6 | 5 | 1742.6046 | 1743.6119 | 1765.5338 |
| 83 | 113 | 205 | 7 | 5 | 1744.5839 | 1745.5912 | 1767.5131 |
| 83 | 113 | 205 | 7 | 5 | 1744.5839 | 1745.5912 | 1767.5131 |
| 83 | 113 | 205 | 7 | 5 | 1744.5839 | 1745.5912 | 1767.5131 |
| 83 | 113 | 207 | 7 | 5 | 1746.5995 | 1747.6068 | 1769.5288 |
| 84 | 114 | 203 | 7 | 5 | 1754.5682 | 1755.5755 | 1777.4975 |
| 85 | 115 | 209 | 6 | 5 | 1756.6203 | 1757.6275 | 1779.5495 |
| 84 | 114 | 207 | 7 | 5 | 1758.5995 | 1759.6068 | 1781.5288 |
| 84 | 114 | 207 | 7 | 5 | 1758.5995 | 1759.6068 | 1781.5288 |
| 84 | 114 | 207 | 7 | 5 | 1758.5995 | 1759.6068 | 1781.5288 |
| 84 | 114 | 209 | 7 | 5 | 1760.6152 | 1761.6225 | 1783.5444 |
| 85 | 115 | 203 | 7 | 5 | 1766.5682 | 1767.5755 | 1789.4975 |
| 85 | 115 | 205 | 7 | 5 | 1768.5839 | 1769.5912 | 1791.5131 |
| 86 | 116 | 211 | 6 | 5 | 1770.6359 | 1771.6432 | 1793.5651 |
| 85 | 115 | 209 | 7 | 5 | 1772.6152 | 1773.6225 | 1795.5444 |
| 85 | 115 | 209 | 7 | 5 | 1772.6152 | 1773.6225 | 1795.5444 |
| 85 | 115 | 209 | 7 | 5 | 1772.6152 | 1773.6225 | 1795.5444 |
| 85 | 115 | 211 | 7 | 5 | 1774.6308 | 1775.6381 | 1797.5601 |
| 86 | 116 | 205 | 7 | 5 | 1780.5839 | 1781.5912 | 1803.5131 |
| 87 | 117 | 213 | 6 | 5 | 1784.6516 | 1785.6588 | 1807.5808 |
| 86 | 116 | 211 | 7 | 5 | 1786.6308 | 1787.6381 | 1809.5601 |
| 86 | 116 | 211 | 7 | 5 | 1786.6308 | 1787.6381 | 1809.5601 |
| 86 | 116 | 211 | 7 | 5 | 1786.6308 | 1787.6381 | 1809.5601 |
| 86 | 116 | 213 | 7 | 5 | 1788.6465 | 1789.6538 | 1811.5757 |
| 87 | 117 | 205 | 7 | 5 | 1792.5839 | 1793.5912 | 1815.5131 |
| 87 | 117 | 207 | 7 | 5 | 1794.5995 | 1795.6068 | 1817.5288 |
| 88 | 118 | 215 | 6 | 5 | 1798.6672 | 1799.6745 | 1821.5964 |
| 87 | 117 | 213 | 7 | 5 | 1800.6465 | 1801.6538 | 1823.5757 |
| 87 | 117 | 213 | 7 | 5 | 1800.6465 | 1801.6538 | 1823.5757 |
| 87 | 117 | 213 | 7 | 5 | 1800.6465 | 1801.6538 | 1823.5757 |
| 87 | 117 | 215 | 7 | 5 | 1802.6621 | 1803.6694 | 1825.5914 |
| 88 | 118 | 207 | 7 | 5 | 1806.5995 | 1807.6068 | 1829.5288 |
| 89 | 119 | 217 | 6 | 5 | 1812.6829 | 1813.6901 | 1835.6121 |
| 88 | 118 | 215 | 7 | 5 | 1814.6621 | 1815.6694 | 1837.5914 |
| 88 | 118 | 215 | 7 | 5 | 1814.6621 | 1815.6694 | 1837.5914 |
| 88 | 118 | 215 | 7 | 5 | 1814.6621 | 1815.6694 | 1837.5914 |
| 88 | 118 | 217 | 7 | 5 | 1816.6778 | 1817.6851 | 1839.6070 |
| 89 | 119 | 209 | 7 | 5 | 1820.6152 | 1821.6225 | 1843.5444 |
| 90 | 120 | 219 | 6 | 5 | 1826.6985 | 1827.7058 | 1849.6277 |
| 89 | 119 | 217 | 7 | 5 | 1828.6778 | 1829.6851 | 1851.6070 |
| 89 | 119 | 217 | 7 | 5 | 1828.6778 | 1829.6851 | 1851.6070 |
| 89 | 119 | 217 | 7 | 5 | 1828.6778 | 1829.6851 | 1851.6070 |
| 89 | 119 | 219 | 7 | 5 | 1830.6934 | 1831.7007 | 1853.6227 |
| 91 | 121 | 221 | 6 | 5 | 1840.7142 | 1841.7214 | 1863.6434 |
| 90 | 120 | 219 | 7 | 5 | 1842.6934 | 1843.7007 | 1865.6227 |
| 90 | 120 | 219 | 7 | 5 | 1842.6934 | 1843.7007 | 1865.6227 |
| 90 | 120 | 219 | 7 | 5 | 1842.6934 | 1843.7007 | 1865.6227 |
| 90 | 120 | 221 | 7 | 5 | 1844.7091 | 1845.7164 | 1867.6383 |
| 91 | 121 | 221 | 7 | 5 | 1856.7091 | 1857.7164 | 1879.6383 |
| 91 | 121 | 221 | 7 | 5 | 1856.7091 | 1857.7164 | 1879.6383 |
| 91 | 121 | 223 | 7 | 5 | 1858.7247 | 1859.7320 | 1881.6540 |
| 92 | 122 | 223 | 7 | 5 | 1870.7247 | 1871.7320 | 1893.6540 |
| 93 | 123 | 225 | 7 | 5 | 1884.7404 | 1885.7477 | 1907.6696 |
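The MW and M + H columns of Table 1 appear to be monoisotopic masses derived from the elemental formula. The following is a minimal sketch of that arithmetic, assuming standard monoisotopic atomic masses and a singly protonated ion; the function names are hypothetical and are not taken from the specification. Under these assumptions the computed values agree with the tabulated MW and M + H entries to within rounding in the fourth decimal place.

```python
# Standard monoisotopic masses of the most abundant isotopes, in unified
# atomic mass units (u). These constants are assumptions of this sketch.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "N": 14.00307400}
PROTON = 1.00727647  # mass of a proton, in u


def monoisotopic_mass(c, h, o, n):
    """Neutral monoisotopic mass for the elemental composition C(c)H(h)O(o)N(n)."""
    counts = {"C": c, "H": h, "O": o, "N": n}
    return sum(MONOISOTOPIC[element] * count for element, count in counts.items())


def protonated_mz(c, h, o, n):
    """m/z of the singly charged [M+H]+ ion for the same composition."""
    return monoisotopic_mass(c, h, o, n) + PROTON


if __name__ == "__main__":
    # (C, H, O, N) counts copied from the first two rows of Table 1.
    for formula in [(83, 145, 8, 5), (86, 153, 6, 5)]:
        print(formula, round(monoisotopic_mass(*formula), 4),
              round(protonated_mz(*formula), 4))
```

A check of this kind can be used to confirm that a candidate elemental formula is consistent with an observed ion before assigning it to a mycoloylated TbAd variant.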

In certain embodiments, the compounds of Formula I-IV further comprise an acetyl group and/or fatty acid group, and such compounds are detected as a measure of the subject having TB infection.

Biological Samples

In methods, systems, and assays of embodiments of the invention, the biological samples (test samples) are tested to determine the presence or absence of one or more compounds (i.e. the compounds of Formula I-IV) that are indicative of M. tuberculosis being present in the sample, and thus are indicative that the subject is infected with M. tuberculosis.

Any biological sample that is derived from a subject can be used in methods of the invention. In certain embodiments, the biological sample is selected from the group consisting of: breath, sputum, blood, urine, gastric lavage, and pleural fluid. The biological sample can also be a sample selected from the group consisting of: lung tissue, lymphoid tissue e.g. associated with the lung, paranasal sinuses, bronchi, a bronchiole, alveolus, ciliated mucosal epithelia of the respiratory tract, mucosal epithelia of the respiratory tract, squamous epithelial cells of the respiratory tract, a mast cell, a goblet cell, a pneumocyte (type 1 or type 2), bronchoalveolar lavage fluid (BAL), alveolar lining fluid, an intraepithelial dendritic cell, sputum, mucus, saliva, blood, serum, plasma, a peripheral blood mononuclear cell (PBMC), a neutrophil and a monocyte.

Samples can be collected in solid, liquid, and/or gaseous form. Methods of sample collection are well known to those of skill in the art. In one embodiment, the biological sample is obtained from a subject by a method selected from the group consisting of surgery or other excision method, aspiration of a body fluid such as hypertonic saline or propylene glycol, bronchoalveolar lavage, bronchoscopy, saliva collection with a glass tube, salivette (Sarstedt AG, Sevelen, Switzerland), Ora-sure (Epitope Technologies Pty Ltd, Melbourne, Victoria, Australia), omni-sal (Saliva Diagnostic Systems, Brooklyn, N.Y., USA), collection of gaseous material, and blood collection, e.g. by use of a syringe. Methods of collection of plasma are also described in Gershman, N. H. et al, J Allergy Clin Immunol, 10(4): 322-328, 1999.

In certain embodiments, the biological sample is treated to lyse the cells in the sample. Such methods include, e.g., the use of detergents, enzymes, repeatedly freezing and thawing said cells, sonication and/or vortexing the cells in the presence of glass beads, amongst others.

In another embodiment, the biological sample is treated to denature proteins or extract lipids. Methods of denaturing proteins are well known to those of skill in the art and include, e.g. heating a sample, treatment with 2-mercaptoethanol, or treatment with detergents and other compounds such as, for example, guanidinium or urea. In yet another embodiment, a biological sample is treated to concentrate a protein in said sample. Methods of concentrating proteins include precipitation, freeze drying, use of funnel tube gels (TerBush and Novick, Journal of Biomolecular Techniques, 10(3); 1999), ultrafiltration or dialysis. Methods of extracting lipids are well known to those of skill in the art and include, e.g. treating with chloroform, methanol and other organic solvents.

The sample can be analyzed directly for the one or more compounds (Formulas I-IV). Alternatively, the sample can be cultured in a suitable growth medium to allow growth and metabolism of bacteria in the sample. The bacteria in a sample may be grown in media or in culture. Samples can be cultured for any amount of time that allows for propagation of bacteria. For example, samples may be cultured for less than 2 hours, 2-4 hours, 4-6 hours, 6-10 hours, more than 10 hours or more than 24 hours. The culture may include any known bacterial culturing media, for example glucose, lipids, short-chain fatty acids, etc., such as propionate, cholesterol, and/or palmitate, or sodium propionate.

In certain embodiments, the methods of the invention, which determine the presence or absence of the compound/s of Formula I-IV in the biological samples, comprise the step of comparing the data of the biological sample obtained from the subject (i.e. the test sample) with data from a reference sample. As one of skill in the art is aware, such comparison can remove background noise in any assay of determination.

In one embodiment, the reference sample is a biological sample of the same type from a subject that does not have Mycobacterium tuberculosis infection. The subject can be determined not to have infection using established Mycobacterium tuberculosis diagnostics, and confirmed by culturing. For example, sputum smears and cultures can be done for acid-fast bacilli by using fluorescence microscopy (auramine-rhodamine staining), which is more sensitive than conventional Ziehl-Neelsen staining (See e.g. Kumar, Vinay; et al. (2007) Robbins Basic Pathology (8th ed.) Saunders Elsevier. pp. 516-522; Burke and Parnell. Minimal Pulmonary Tuberculosis. 1948. 59:348 Canadian Medical Association Journal; and Steingart K, Henry M, Ng V, et al. (2006) “Fluorescence versus conventional sputum smear microscopy for tuberculosis: a systematic review”. Lancet Infect Dis 6 (9): 570-81). In cases where there is no spontaneous sputum production, a sample can be induced, usually by nebulized inhalation of a saline or saline with bronchodilator solution. A comparative study found that inducing three sputum samples is more sensitive than three gastric washings (Brown M, Varia H, Bassett P, Davidson R N, Wall R, Pasvol G (2007). “Prospective study of sputum induction, gastric washing, and bronchoalveolar lavage for the diagnosis of pulmonary tuberculosis in patients who are unable to expectorate”. Clin Infect Dis 44 (11): 1415-20).

In one embodiment of the invention, the reference sample and the test (or subject) sample are both processed and assayed in the same manner. The data obtained for the reference sample and the test sample are then compared. In one embodiment, the reference sample and the test sample are processed, analyzed or assayed at the same time. In another embodiment, the reference sample and the test sample are processed, analyzed or assayed at different times.

In an alternate embodiment, the reference sample is derived from an established data set that has been previously generated, also known as reference data. In one embodiment, the reference data is obtained from a single biological sample of the same type from a subject that does not have Mycobacterium tuberculosis infection. In one embodiment, the reference sample comprises data from a sample population study of individuals that do not have TB infection, such as, for example, statistically significant data of background ranges of compound data. Data derived from processing, analyzing or assaying the test sample is then compared to data obtained for the sample population that does not have M. tuberculosis. Reference data is obtained from a sufficiently large number of reference samples so as to be representative of a population and allows for the generation of a data set for determining the average level of any particular parameter.

As used herein, the term “statistically significant” or “significantly” refers to statistical significance and generally means a two standard deviation (2SD) or greater difference.
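As a minimal sketch of the 2SD criterion, assuming the reference data are a list of signal intensities for one compound measured in uninfected reference samples, a test value can be compared with the reference mean as follows. The function name, the use of the sample standard deviation, and the example numbers are illustrative assumptions rather than part of the described methods.

```python
from statistics import mean, stdev


def is_significant(test_value, reference_values, n_sd=2.0):
    """Return True if test_value differs from the reference mean by n_sd
    or more sample standard deviations (2SD by default)."""
    ref_mean = mean(reference_values)
    ref_sd = stdev(reference_values)  # sample SD; needs at least two values
    if ref_sd == 0:
        # All reference values are identical: any difference counts.
        return test_value != ref_mean
    return abs(test_value - ref_mean) >= n_sd * ref_sd


if __name__ == "__main__":
    # Hypothetical background intensities from reference samples.
    background = [10, 12, 11, 9, 10, 11]
    print(is_significant(25, background))
    print(is_significant(11, background))
```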

In certain aspects, methods are provided for determining if a subject is responsive to a treatment for M. tuberculosis. In such aspects, in some embodiments, the concentration of one or more compounds of Formula I-IV is determined by measuring the amount of the compound/s in a biological sample taken from a subject at a time point before treatment, and is then compared to data obtained from a biological sample from the same subject after treatment. Alternatively, a biological sample is taken at the time of treatment, and another after a period of time, e.g. two days, three days, four days, or more after treatment. A decrease in the amount of one or more compounds of Formula I-IV indicates that the treatment for TB infection is working to reduce the bacterial load of TB.
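The before and after comparison can be sketched as a simple percent change, assuming the two amounts were measured for the same compound in the same sample type and in the same units. The function names and the example amounts are hypothetical.

```python
def percent_change(before, after):
    """Percent change in compound amount from the pre-treatment sample
    to the post-treatment sample (negative values indicate a decrease)."""
    if before <= 0:
        raise ValueError("pre-treatment amount must be positive")
    return 100.0 * (after - before) / before


def is_responding(before, after):
    """A decrease in the measured amount is read as a response to treatment."""
    return after < before


if __name__ == "__main__":
    # Hypothetical amounts measured before and after treatment.
    print(percent_change(200.0, 50.0), is_responding(200.0, 50.0))
```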

Detection of Compounds

There are many methods available to those of skill in the art for detection/measurement of the compounds described herein that are specific to M. tuberculosis (i.e. the compounds of Formula I-IV). Non-limiting examples include mass spectrometry (MS), nuclear magnetic resonance spectroscopy, and an immunoassay. Suitable techniques include, for example, high performance liquid chromatography mass spectrometry (HPLC-MS) or collision induced mass spectrometry (CID-MS), MALDI/TOF (time-of-flight), SELDI/TOF, liquid chromatography-mass spectrometry (LC-MS), gas chromatography-mass spectrometry (GC-MS), capillary electrophoresis-mass spectrometry, nuclear magnetic resonance spectrometry, or tandem mass spectrometry (e.g., MS/MS, MS/MS/MS, ESI-MS/MS, etc.). See for example, U.S. Patent Application Nos: 20030199001, 20030134304, 20030077616, which are herein incorporated by reference. Mass spectrometry methods are well known in the art (see, e.g., Li et al. (2000) Tibtech 18:151-160; Rowley et al. (2000) Methods 20: 383-397; and Kuster and Mann (1998) Curr. Opin. Structural Biol. 8: 393-400; Chait et al., Science 262:89-92 (1993); Keough et al., Proc. Natl. Acad. Sci. USA. 96:7131-6 (1999); reviewed in Bergman, EXS 88:133-44 (2000)). For additional information regarding mass spectrometers, see, e.g., Principles of Instrumental Analysis, 3rd edition, Skoog, Saunders College Publishing, Philadelphia, 1985; and Kirk-Othmer Encyclopedia of Chemical Technology, 4th ed. Vol. 15, John Wiley & Sons, New York 1995, pp. 1071-1094. Software programs such as the Biomarker Wizard program (Ciphergen Biosystems, Inc., Fremont, Calif.) can be used to aid in analyzing mass spectra, e.g., comparing the signal strength of peak values from spectra of a test subject sample and a control sample (e.g., a normal healthy person not having a compound of Formula I-IV, or in the alternative a positive control having the compound/s).

In one embodiment, liquid chromatography/mass spectrometry (LC/MS) is used for detection of bacterial compounds I-IV. For example, LC/MS data files can be processed with the MassHunter Qualitative Analysis Software version B.02.00 (Agilent Technologies, Santa Clara, Calif.).
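Once a peak list has been exported from the LC/MS software, observed ions can be compared with the expected M + H values of Table 1. The following is a minimal sketch of such a lookup; it is not a description of the MassHunter software. The 10 ppm tolerance, the function names, and the short list of expected values are assumptions chosen for illustration.

```python
# A few expected [M+H]+ values copied from the M + H column of Table 1.
EXPECTED_MH = [1341.1166, 1353.1893, 1355.1322, 1369.1479]


def ppm_error(observed, expected):
    """Signed mass error of an observed m/z relative to an expected m/z, in ppm."""
    return (observed - expected) / expected * 1e6


def match_ions(observed_mz, expected_mz=EXPECTED_MH, tol_ppm=10.0):
    """Return (observed, expected, ppm error) for every observed ion that
    lies within tol_ppm of an expected value."""
    hits = []
    for obs in observed_mz:
        for exp in expected_mz:
            err = ppm_error(obs, exp)
            if abs(err) <= tol_ppm:
                hits.append((obs, exp, round(err, 2)))
    return hits


if __name__ == "__main__":
    # Hypothetical observed peak list (m/z values).
    print(match_ions([1341.1200, 1500.0000]))
```

An empty result for a test sample would correspond to the absence of the listed variants at the chosen tolerance, while one or more hits would flag candidate ions for confirmation, e.g. by tandem mass spectrometry.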

In certain embodiments, a gas phase ion spectrophotometer is used. In other embodiments, laser-desorption/ionization mass spectrometry is used to analyze the sample. Modern laser desorption/ionization mass spectrometry (“LDI-MS”) can be practiced in two main variations: matrix assisted laser desorption/ionization (“MALDI”) mass spectrometry and surface-enhanced laser desorption/ionization (“SELDI”). In MALDI, the analyte is mixed with a solution containing a matrix, and a drop of the liquid is placed on the surface of a substrate. The matrix solution then co-crystallizes with the biological molecules. The substrate is inserted into the mass spectrometer. Laser energy is directed to the substrate surface where it desorbs and ionizes the biological molecules without significantly fragmenting them. In SELDI, the substrate surface is modified so that it is an active participant in the desorption process. In one variant, the surface is derivatized with adsorbent and/or capture reagents that selectively bind the compound of interest. In another variant, the surface is derivatized with energy absorbing molecules that are not desorbed when struck with the laser. In another variant, the surface is derivatized with molecules that bind the compound of interest and that contain a photolytic bond that is broken upon application of the laser. In each of these methods, the derivatizing agent generally is localized to a specific location on the substrate surface where the sample is applied. See, e.g., U.S. Pat. No. 5,719,060 and WO 98/59361. The two methods can be combined by, for example, using a SELDI affinity surface to capture an analyte and adding matrix-containing liquid to the captured analyte to provide the energy absorbing material. For additional information regarding mass spectrometers, see, e.g., Principles of Instrumental Analysis, 3rd edition, Skoog, Saunders College Publishing, Philadelphia, 1985; and Kirk-Othmer Encyclopedia of Chemical Technology, 4th ed. Vol. 15 (John Wiley & Sons, New York 1995), pp. 1071-1094.

Detection of the presence of one or more compounds of Formula I-IV will typically involve detection of signal intensity. This, in turn, can reflect the quantity. For example, in certain embodiments, the signal strength of peak values from spectra of a first sample and a second sample can be compared (e.g., visually, by computer analysis etc.), to determine the relative amounts of particular compounds. Software programs such as the Biomarker Wizard program (Ciphergen Biosystems, Inc., Fremont, Calif.) can be used to aid in analyzing mass spectra. The mass spectrometers and their techniques are well known to those of skill in the art. As any person skilled in the art understands, any of the components of a mass spectrometer (e.g., desorption source, mass analyzer, detector, etc.) and varied sample preparations can be combined with other suitable components or preparations described herein, or with those known in the art. For example, in some embodiments a control sample may contain heavy atoms (e.g. ¹³C), thereby permitting the test sample to be mixed with the known control sample in the same mass spectrometry run.

In one embodiment, a laser desorption time-of-flight (TOF) mass spectrometer is used. In laser desorption mass spectrometry, a substrate with a bound marker is introduced into an inlet system. The marker is desorbed and ionized into the gas phase by laser from the ionization source. The ions generated are collected by an ion optic assembly, and then in a time-of-flight mass analyzer, ions are accelerated through a short high voltage field and let drift into a high vacuum chamber. At the far end of the high vacuum chamber, the accelerated ions strike a sensitive detector surface at different times. Since the time-of-flight is a function of the mass of the ions, the elapsed time between ion formation and ion detector impact can be used to identify the presence or absence of molecules of specific mass to charge ratio. In some embodiments the relative amounts of one or more compounds present in a first or second sample is determined, in part, by executing an algorithm with a programmable digital computer. The algorithm identifies at least one peak value in the first mass spectrum and the second mass spectrum. The algorithm then compares the signal strength of the peak value of the first mass spectrum to the signal strength of the peak value of the second mass spectrum. The relative signal strengths are an indication of the amount of the biomolecule that is present in the first and second samples. A standard containing a known amount of a biomolecule can be analyzed as the second sample to better quantify the amount of the biomolecule present in the first sample. In certain embodiments, the identity of the biomolecules in the first and second sample can also be determined.
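The peak comparison described above can be sketched as follows. This is a minimal illustration, not the algorithm of any particular instrument software: it assumes each spectrum is a list of (m/z, intensity) pairs, that signal scales linearly with amount, and that a peak is matched within a fixed m/z tolerance. The function names, the tolerance, and all numeric values (including the m/z of 500.0) are hypothetical placeholders.

```python
def peak_intensity(spectrum, target_mz, tol=0.02):
    """Return the highest intensity within +/- tol of target_mz, or 0.0 if no peak."""
    hits = [intensity for mz, intensity in spectrum if abs(mz - target_mz) <= tol]
    return max(hits, default=0.0)


def relative_amount(first, second, target_mz, tol=0.02):
    """Ratio of the peak signal in the first spectrum to that in the second."""
    first_signal = peak_intensity(first, target_mz, tol)
    second_signal = peak_intensity(second, target_mz, tol)
    if second_signal == 0.0:
        raise ValueError("no matching peak in the second spectrum")
    return first_signal / second_signal


def estimate_amount(sample, standard, target_mz, standard_amount, tol=0.02):
    """Scale the known amount in the standard by the signal ratio (assumes linearity)."""
    return relative_amount(sample, standard, target_mz, tol) * standard_amount


# Hypothetical spectra: (m/z, intensity) pairs, arbitrary units.
sample = [(499.99, 120.0), (500.0, 3000.0), (612.4, 80.0)]
standard = [(500.0, 6000.0), (700.1, 50.0)]

ratio = relative_amount(sample, standard, 500.0)
amount = estimate_amount(sample, standard, 500.0, standard_amount=10.0)
print(ratio, amount)
```

Here the second spectrum plays the role of the standard containing a known amount, so the ratio of signal strengths converts directly into an estimated amount for the first sample.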

In one embodiment, the presence of one or more compounds of Formula I-IV is detected by determining the presence of host antibodies directed against the compound/s, e.g. by an immunoassay. The compounds of Formula I-IV are not normally present in individuals that are not infected with Mycobacterium tuberculosis, thus the compounds are antigens, and antibodies that bind to these compounds are generated by the host.

In one embodiment, the immunoassay used is similar to the established PPD test for Mycobacterium tuberculosis (See e.g. Von Reyn CF et al. (2001) Int J Tuberc Lung Disease Dec;5(12):1122-8.). The PPD test checks for the presence of host antibodies against PPD. If the patient has TB they will exhibit a positive reaction to the injected PPD. In embodiments of the methods of the invention, the presence of host antibodies directed against one or more compounds of Formula I-IV can be similarly tested. For example, one or more compounds of Formula I-IV can be administered to a subject, e.g. injected under the first layer of skin. The subject can then be monitored for an immune reaction to the one or more compounds of Formula I-IV, wherein a positive immune reaction after 48 to 72 hours indicates that the subject is infected with Mycobacterium tuberculosis. The positive immune reaction occurs because the compound/s were present before administration of the test (before injection of the compound/s), and the subject had already mounted an immune reaction against those compounds.

A common immunoassay is the “Enzyme-Linked Immunosorbent Assay (ELISA).” There are different forms of ELISA, which are well known to those skilled in the art. Standard techniques are described in, e.g., “Methods in Immunodiagnosis”, 2nd Edition, Rose and Bigazzi, eds. John Wiley & Sons, 1980; Campbell et al., “Methods in Immunology”, W. A. Benjamin, Inc., 1964; and Oellerich, M. 1984, J. Clin. Chem. Clin. Biochem., 22:895-904.

In another aspect of the invention, an immunoassay (e.g. an ELISA or other assay) is performed to measure the presence of one or more compounds of Formula I-IV, wherein an antibody that specifically binds to the compound/s is used to directly detect the compound in a biological sample from a subject. As a non-limiting example, an antibody that binds to a compound of Formula I-IV can be conjugated to a solid support (e.g. tissue culture plate, a gel, a membrane, a column, or a bead) to serve as a ‘capture antibody’, and a test biological sample incubated with the antibody conjugated to the solid support. A compound that has bound to the antibody can then be detected using a second antibody that specifically binds to the compound, optionally a labeled antibody (e.g. sandwich ELISA). Alternatively, the compound/s can be eluted from the capture antibody and detected by other methods, including but not limited to HPLC or mass spectrometry.

In one embodiment, an ELISA is performed by coating one or more compounds of Formula I-IV on a tissue culture plate, coating the plate with a blocking agent, such as gelatin or BSA, and then incubating the coated ELISA plate with the biological sample, e.g. blood plasma or sera, for a sufficient time to allow host antibody to bind the compound/s. The presence of the bound host antibody is then detected. In one embodiment, the sample plates are then incubated with an anti-host antibody (e.g. anti-human antibody), which is optionally detectably labeled, and the bound antibody detected in an ELISA plate reader. Variants of the assay can be performed, for example by attaching one or more of the compounds of Formula I-IV on any solid support, e.g. beads, membrane, dipstick, or a column, rather than a tissue culture plate. The compounds can alternatively be linked to the solid support using chemical linkers; such methods are known to those of skill in the art.

“Labeled antibody”, as used herein, refers to antibodies that are labeled by a detectable means and include, but are not limited to, antibodies that are enzymatically, radioactively, fluorescently, and chemiluminescently labeled. Antibodies can also be labeled with a detectable tag, such as c-Myc, HA, VSV-G, HSV, FLAG, V5, or HIS.

In certain embodiments, a presence or amount of the Mycobacterium tuberculosis bacteria in the sample is identified based on the measured presence and/or concentration of one of the compounds of Formula I-IV detected in the sample. In certain embodiments, a presence or amount of Mycobacterium tuberculosis bacteria in a sample is determined based on the presence or concentration of two or more compounds detected in the sample. In certain embodiments, a presence or amount of Mycobacterium tuberculosis bacteria in a sample is determined based on the presence or concentration of three or more compounds detected in the sample. In certain embodiments, the presence and/or amount of the Mycobacterium tuberculosis bacteria in a sample is identified at various time points, for example following administration of a therapy, so that a change in bacterial burden can be measured and the efficacy of the therapy identified.

The concentration of a compound of Formula I-IV is proportional to the number of bacteria in a subject. Thus a subject is determined to be responsive to a therapeutic treatment if the concentration of a compound of Formula I-IV decreases by a statistically significant amount as compared to the concentration of the compound before treatment.
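One way to check for a statistically significant decrease is sketched below. The text does not name a statistical test, so the choice here is an assumption: a one-sided exact permutation test on replicate concentration measurements taken before and after treatment, with a hypothetical significance level of 0.05. The function names and the measurement values are illustrative only.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(before, after):
    """One-sided exact permutation test.

    Returns the fraction of all relabelings of the pooled measurements whose
    mean decrease (first group minus second group) is at least as large as
    the observed decrease from `before` to `after`.
    """
    observed = mean(before) - mean(after)
    pooled = list(before) + list(after)
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, len(before)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen_set]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating point rounding.
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


def is_responsive(before, after, alpha=0.05):
    """True if the decrease in concentration is significant at level alpha."""
    return permutation_p_value(before, after) <= alpha


# Hypothetical replicate concentrations (arbitrary units).
before = [10.2, 9.8, 10.5, 10.1]
after = [4.1, 3.9, 4.4, 4.0]

p_value = permutation_p_value(before, after)
print(round(p_value, 4), is_responsive(before, after))
```

With four replicates per time point there are 70 possible relabelings, so the smallest attainable p-value is 1/70; more replicates would be needed to support a stricter significance level.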

Treatment Regimes

In embodiments of the invention, when individuals are identified as having TB, i.e. identified as having one or more compounds of Formula I-IV, a specialized treatment regime designed for treatment of TB is indicated.

In certain embodiments, the methods further comprise administration of a TB therapeutic when the subject is identified as having TB. For example, positively identified individuals can be administered a therapeutically or prophylactically effective amount of one or more agents that inhibit the replication of M. tuberculosis, or that elicit an immune response against M. tuberculosis.

TB treatment regimes for latent TB infection include, but are not limited to, administration of one or more of the following TB therapeutics: isoniazid (INH), rifampin (RIF) and rifapentine (RPT). Subjects with active TB disease are treated by taking a combination of two or more of the following therapeutic drugs for 6 to 9 months: isoniazid (INH), ethambutol (EMB), pyrazinamide (PZA), rifampin (RIF), and rifapentine (RPT). Regimens for treating TB disease usually have an initial phase of 2 months with one or more therapeutics, followed by a choice of combination phases of two or more therapeutics for either 4 or 7 months (total of 6 to 9 months for treatment). The combination therapy is done to prevent resistance.

TB therapeutics can be classified into 5 groups: Group 1 TB drugs are the first line of defense and include agents such as oral pyrazinamide, ethambutol, and rifabutin; Group 2 TB drugs are injectable and include agents such as kanamycin, amikacin, capreomycin, and streptomycin; Group 3 TB drugs include the fluoroquinolones such as levofloxacin, moxifloxacin, and ofloxacin; Group 4 TB drugs include the oral bacteriostatic second line of defense agents, such as para-aminosalicylic acid, cycloserine, terizidone, ethionamide, and protionamide; Group 5 TB drugs include agents with an unclear role in the treatment of drug resistant TB and include agents such as clofazimine, linezolid, amoxicillin/clavulanate, thioacetazone, and imipenem/cilastatin, as well as high-dose isoniazid and clarithromycin.

TB therapeutic isoniazid (Laniazid, Nydrazid) is also known as isonicotinylhydrazine, and is the first-line medication in prevention and treatment of tuberculosis (Hans L Rieder (2009), Fourth-generation fluoroquinolones in tuberculosis, Lancet 373 (9670): 1148-1149). Isoniazid is manufactured from isonicotinic acid, which is produced from 4-methylpyridine. Isoniazid is available in tablet, syrup, and injectable forms (given intramuscularly or intravenously).

TB therapeutic rifampicin is also known as rifaldazine, RMP, Rofact (in Canada), and rifampin (in the United States) (Masters, Susan B.; et al. (2005), Katzung & Trevor's pharmacology, New York: Lange Medical Books/McGraw Hill, Medical Pub. Division). There are various types of rifamycins. The rifampicin form, with a 4-methyl-1-piperazinaminyl group, is the most clinically effective.

TB therapeutic rifapentine was approved by the Food and Drug Administration (FDA) in June 1998. It is synthesized in one step from rifampicin (Sharma S K et al. (2013), Rifamycins (rifampicin, rifabutin and rifapentine) compared to isoniazid for preventing tuberculosis in HIV-negative people at risk of active TB, Cochrane Database of Systematic Reviews: 7).

TB therapeutic ethambutol is usually given in combination with other tuberculosis drugs, such as isoniazid, rifampicin and pyrazinamide (Yendapally R, Lee R E (2008). “Design, synthesis, and evaluation of novel ethambutol analogues”. Bioorg. Med. Chem. Lett. 18 (5): 1607-11).

TB therapeutic pyrazinamide is used in combination with drugs such as isoniazid and rifampicin. Pyrazinamide is used in the first two months of treatment to reduce the duration of treatment required (Hong Kong Chest Service, Medical Research Council (1981) Controlled trial of four thrice weekly regimens and a daily regimen given for 6 months for pulmonary tuberculosis, Lancet 1(8213): 171-174). Regimens not containing pyrazinamide must be taken for nine months or more.

An example dosage regime of a therapeutic TB drug in adults includes, e.g. 5 mg/kg/day (max 300 mg daily of each therapeutic) for 6 months. Dosages may also be given intermittently. For example, for isoniazid the Centers for Disease Control (CDC) recommends 15 mg/kg/day twice weekly (900 mg max dose), and the World Health Organization (WHO) recommends 10 mg/kg/day three times weekly (900 mg max dose) for either 6 or 9 months. When prescribed intermittently (twice or thrice weekly), the dose is 10-15 mg/kg (max 900 mg daily), depending on the regimen chosen. Patients with slow clearance of the drug (via acetylation as described above) may require reduced dosages to avoid toxicity. The recommended dosages of the TB therapeutics are well established, and known to those of skill in the art.

As used herein, the terms “treat” or “treatment” or “treating” refer to both therapeutic treatment and prophylactic (i.e. preventative) measures, wherein the object is to prevent or slow the development of virulent TB infection. Treatment is generally “effective” if one or more symptoms or clinical markers of TB are reduced. In one embodiment, “treatment” includes curing of disease; however, in another embodiment, treatment does not include curing of disease. Treatment can prevent the onset of disease and reduce symptoms, e.g. such that they are greatly reduced or such that they are not detectable. For example, treatment is “effective” if the bacterial concentration is significantly reduced or an increase in growth prevented. Beneficial or desired clinical results include, but are not limited to, alleviation of one or more symptom(s) of TB, diminishment of extent of TB disease (i.e., not worsening), delay or slowing of TB growth, amelioration or palliation of disease state, and remission.

The term “effective amount” as used herein refers to the amount of a pharmaceutical composition needed to decrease at least one symptom of TB, and relates to a sufficient amount of pharmacological composition to provide the desired effect. The phrases “therapeutically effective amount” and “pharmaceutically effective amount” are used interchangeably and as used herein mean a sufficient amount of the composition to treat TB at a reasonable benefit/risk ratio applicable to any medical treatment. The term “therapeutically effective amount” therefore refers to an amount of the composition that is sufficient to effect a therapeutically or prophylactically significant reduction in a symptom or clinical marker associated with TB.

In certain embodiments, a therapeutically effective amount reduces the number of bacteria in a subject (bacterial load). A therapeutically or prophylactically significant reduction in a symptom or reduction of bacterial load is, e.g. at least about 10%, at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, or at least about 90% in a measured parameter as compared to a control (e.g. bacterial load and symptoms assessed in the subject before treatment). Measured or measurable parameters include clinically detectable markers of disease, for example, elevated or depressed levels of a biological marker such as TBAd compounds described herein (i.e. Formula I-IV), as well as parameters related to a clinically accepted scale of symptoms or markers for a disease or disorder. It should be understood, however, that the total daily usage of the compositions and formulations as disclosed herein will be decided by the attending physician within the scope of sound medical judgment. The exact amount required will vary depending on factors such as age, weight and severity of TB disease being treated.

The TB therapeutic may be administered by any suitable means. The compound suitable for treatment of TB may be contained in any appropriate amount in any pharmaceutically acceptable carrier substance, and is generally present in an amount of 1-95% by weight of the total weight of the composition. The composition may be provided in a dosage form that is suitable for the oral, parenteral (e.g., intravenously or intramuscularly), intraperitoneal, rectal, cutaneous, nasal, vaginal, inhalant, skin (patch), or ocular administration route. Thus, the composition may be in the form of, e.g., drops, tablets, capsules, pills, powders, granulates, suspensions, emulsions, solutions, gels including hydrogels, pastes, ointments, creams, plasters, drenches, osmotic delivery devices, suppositories, enemas, injectables, implants, sprays, or aerosols. The pharmaceutical compositions suitable for treatment of TB may be formulated according to conventional pharmaceutical practice (see, e.g., Remington: The Science and Practice of Pharmacy, 20th edition, 2000, ed. A. R. Gennaro, Lippincott Williams & Wilkins, Philadelphia, and Encyclopedia of Pharmaceutical Technology, eds. J. Swarbrick and J. C. Boylan, 1988-1999, Marcel Dekker, New York).

The actual amount of the therapeutic compound/s administered will depend upon numerous factors such as the severity of TB infection to be treated, the age and relative health of the subject, the potency of the compound used, the route and form of administration, and other factors. Therapeutically effective amounts of therapeutic compounds may range from, for example, approximately 0.01-50 mg per kilogram body weight of the recipient per day; preferably about 0.1-20 mg/kg/day. Thus, as an example, for administration to a 70 kg person, the dosage range would most preferably be about 7 mg to 1.4 g per day. The choice of formulation depends on various factors such as the mode of drug administration (e.g., for oral administration, formulations in the form of tablets, pills, or capsules are preferred) and the bioavailability of the drug substance.
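The 70 kg example follows from multiplying body weight by each end of the per-kilogram range. The short sketch below reproduces only that arithmetic; the function name is hypothetical, and the default values are the preferred range of about 0.1-20 mg/kg/day stated above.

```python
def daily_dose_range_mg(weight_kg, low_mg_per_kg=0.1, high_mg_per_kg=20.0):
    """Return (low, high) total daily amounts in mg for a given body weight."""
    return (weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg)


low_mg, high_mg = daily_dose_range_mg(70)
print(f"{low_mg:.0f} mg to {high_mg / 1000:.1f} g per day")

# The broader range of approximately 0.01-50 mg/kg/day for the same person.
broad_low_mg, broad_high_mg = daily_dose_range_mg(70, 0.01, 50.0)
```

For a 70 kg person this gives about 7 mg to 1.4 g per day for the preferred range, and about 0.7 mg to 3.5 g per day for the broader range.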

Pharmaceutical compositions are comprised of, in general, a therapeutic compound in combination with at least one pharmaceutically acceptable excipient. Acceptable excipients are non-toxic, aid administration, and do not adversely affect the therapeutic benefit of the therapeutic compound. Such excipients may be any solid, liquid, semi-solid or, in the case of an aerosol composition, gaseous excipient that is generally available to one skilled in the art.

Solid pharmaceutical excipients include starch, cellulose, talc, glucose, lactose, sucrose, gelatin, malt, rice, flour, chalk, silica gel, magnesium stearate, sodium stearate, glycerol monostearate, sodium chloride, dried skim milk and the like. Liquid and semisolid excipients may be selected from glycerol, propylene glycol, water, ethanol and various oils, including those of petroleum, animal, vegetable or synthetic origin, e.g., peanut oil, soybean oil, mineral oil, sesame oil, etc. Preferred liquid carriers, particularly for injectable solutions, include water, saline, aqueous dextrose, and glycols. Other suitable pharmaceutical excipients and their formulations are described in Remington's Pharmaceutical Sciences, edited by E. W. Martin (Mack Publishing Company, 18th ed., 1990).

Systems and Computer Readable Media

Embodiments of the invention also provide for systems (and computer readable media for causing computer systems) to perform a method for determining whether an individual has been infected with Mycobacterium tuberculosis.

A system for analyzing a biological sample is provided. The system comprises: a) a determination module configured to receive data from measuring a compound present in a biological sample of a subject suspected of having Mycobacterium tuberculosis infection, wherein the compound is selected from the group consisting of a compound of Formula I, Formula II and Formula III, and to optionally determine the concentration of the compound; b) a storage device configured to store information from the determination module; c) a comparison module adapted to compare the data stored on the storage device with reference data, and to provide a comparison result, wherein the comparison result identifies the presence or absence of at least one compound selected from the group consisting of a compound of Formula I, Formula II, and Formula III; and wherein the presence of the at least one compound is indicative that the subject has Mycobacterium tuberculosis infection; and d) a display module for displaying a content based in part on the comparison result for the user, wherein the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least one compound of step c), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of each of the compounds of Formula I, Formula II and Formula III. In certain embodiments, the compound of Formula III is represented by Formula IV.
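The four elements a) through d) can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the claimed implementation: it assumes the measured data arrive as one signal value per compound, that the reference data are per-compound thresholds above which a compound counts as present, and that the storage device is an in-memory dictionary. All class, function and parameter names, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass, field

FORMULAS = ("Formula I", "Formula II", "Formula III")


@dataclass
class StorageDevice:
    """b) Stores information from the determination module, keyed by subject."""
    records: dict = field(default_factory=dict)

    def store(self, subject_id, data):
        self.records[subject_id] = dict(data)

    def load(self, subject_id):
        return self.records[subject_id]


def determination_module(raw_signals):
    """a) Receive measurement data; compounds with no reading default to 0.0."""
    return {name: float(raw_signals.get(name, 0.0)) for name in FORMULAS}


def comparison_module(stored, reference_thresholds):
    """c) Compare stored data with reference data; return the compounds present."""
    return [name for name in FORMULAS if stored[name] > reference_thresholds[name]]


def display_module(present, min_compounds=1):
    """d) Produce the content shown to the user from the comparison result."""
    if len(present) >= min_compounds:
        return "Subject has M. tuberculosis infection (detected: " + ", ".join(present) + ")"
    return "Subject lacks M. tuberculosis infection"


storage = StorageDevice()
storage.store("subject-1", determination_module({"Formula I": 8.0, "Formula II": 0.2}))

thresholds = {name: 1.0 for name in FORMULAS}  # hypothetical reference data
present = comparison_module(storage.load("subject-1"), thresholds)
content = display_module(present)
print(content)
```

The `min_compounds` parameter is one way to express the embodiments that require at least two or at least three compounds before infection is signalled.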

In one embodiment of the system, in step d) the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least two compounds of step c), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of at least two of the compounds of step c).

In another embodiment of the system, in step d) the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least three single compounds of step c).

In still another embodiment of the system, the content further comprises a signal indicating that the subject should be treated for Mycobacterium tuberculosis in the presence of at least one compound selected from the group consisting of Formula I, Formula II, and Formula III.

The invention further provides for a computer readable medium having computer readable instructions recorded thereon to define software modules including a comparison module and a display module for implementing a method on a computer. The method comprises: a) comparing with the comparison module the data stored on a storage device with reference data to provide a comparison result, wherein the comparison result identifies the presence or absence of at least one compound selected from the group consisting of a compound of Formula I, Formula II, and Formula III; and wherein the presence of the at least one compound is indicative that the subject has Mycobacterium tuberculosis infection, and b) displaying a content based in part on the comparison result for the user, wherein the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least one compound of step a), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of each of the compounds of Formula I, Formula II and Formula III.

In one embodiment of the computer readable medium, in step b) of the method the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least two compounds of step a), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of at least two of the compounds of step a).

In one embodiment of the computer readable medium, in step b) of the method the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least three single compounds of step a).

In certain embodiments of the computer readable medium, the compound of Formula III is represented by Formula IV.

In one embodiment of the computer readable medium, in step b) of the method, the content further comprises a signal indicating that the subject should be treated for Mycobacterium tuberculosis in the presence of at least one compound selected from the group consisting of Formula I, Formula II, Formula III and Formula IV.

Embodiments of the invention have been described through functional modules, which are defined by computer executable instructions recorded on computer readable media and which cause a computer to perform method steps when executed. See FIG. 12 and FIG. 13. The modules have been segregated by function for the sake of clarity. However, it should be understood that the modules need not correspond to discrete blocks of code and the described functions can be carried out by the execution of various code portions stored on various media and executed at various times. Furthermore, it should be appreciated that the modules may perform other functions, thus the modules are not limited to having any particular functions or set of functions.

The computer readable media can be any available tangible media that can be accessed by a computer. Computer readable media includes volatile and nonvolatile, removable and non-removable tangible media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer readable media includes, but is not limited to, RAM (random access memory), ROM (read only memory), EPROM (erasable programmable read only memory), EEPROM (electrically erasable programmable read only memory), flash memory or other memory technology, CD-ROM (compact disc read only memory), DVDs (digital versatile disks) or other optical storage media, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage media, other types of volatile and non-volatile memory, and any other tangible medium which can be used to store the desired information and which can be accessed by a computer, including any suitable combination of the foregoing.

Computer-readable data embodied on one or more computer-readable media, or computer readable medium 200, may define instructions, for example, as part of one or more programs, that, as a result of being executed by a computer, instruct the computer to perform one or more of the functions described herein (e.g., in relation to system 10, or computer readable medium 200), and/or various embodiments, variations and combinations thereof. Such instructions may be written in any of a plurality of programming languages, for example, Java, J#, Visual Basic, C, C#, C++, Fortran, Pascal, Eiffel, Basic, COBOL, assembly language, and the like, or any of a variety of combinations thereof. The computer-readable media on which such instructions are embodied may reside on one or more of the components of either of system 10, or computer readable medium 200 described herein, may be distributed across one or more of such components, and may be in transition therebetween.

The computer-readable media may be transportable such that the instructions stored thereon can be loaded onto any computer resource to implement the aspects of the present invention discussed herein. In addition, it should be appreciated that the instructions stored on the computer readable media, or computer-readable medium 200, described above, are not limited to instructions embodied as part of an application program running on a host computer. Rather, the instructions may be embodied as any type of computer code (e.g., software or microcode) that can be employed to program a computer to implement aspects of the present invention. The computer executable instructions may be written in a suitable computer language or combination of several languages. Basic computational biology methods are known to those of ordinary skill in the art and are described in, for example, Setubal and Meidanis et al., Introduction to Computational Biology Methods (PWS Publishing Company, Boston, 1997); Salzberg, Searles, Kasif, (Ed.), Computational Methods in Molecular Biology, (Elsevier, Amsterdam, 1998); Rashidi and Buehler, Bioinformatics Basics: Application in Biological Science and Medicine (CRC Press, London, 2000) and Ouellette and Baxevanis, Bioinformatics: A Practical Guide for Analysis of Genes and Proteins (Wiley & Sons, Inc., 2nd ed., 2001).

The functional modules of certain embodiments of the invention include a determination module, a storage device, a comparison module and a display module. See FIG. 12 and FIG. 13. The functional modules can be executed on one, or multiple, computers, or by using one, or multiple, computer networks. The determination module 40 has computer executable instructions to provide compound information in computer readable form. As used herein, “compound information” refers to data representative of the presence or absence of one or more compounds of Formula I-IV, e.g. this can include but is not limited to data from a mass spectrometer, NMR, or fluorescence meter, e.g. an ELISA plate reader etc. As non-limiting examples, compound information can be presented as ion chromatograms, or e.g. as a positive or negative fluorescent signal (e.g. from an ELISA, or other assay) indicating the presence or absence of the compound respectively. Moreover, information “related to” the compound can include information that includes detection of the presence or absence of particular mycolic acids, determination of the concentration of the compound in the sample (e.g., as a measure of bacterial load), and the like.

As an example, determination modules 40 for determining compound information may include known systems for automated analysis of mass spectrometry data, including but not limited to software options available from AB SCIEX, Framingham, MA such as: Analyst®, software that automates MS to MS/MS acquisition with Information-Dependent Acquisition (IDA) mode and whose Scheduled MRM™ Algorithm uses overlapping MRM monitoring periods to maximize quantitative performance and accuracy; BioPharmaView™; Cliquid®, a software for routine screening and quantitation that provides a simple, four-step workflow for LC/MS/MS analysis and bi-directional LIMS compatibility with any LIMS or LIS; DiscoveryQuant™, a software that improves the speed of analysis and information gathering, in which an optimized model performs a rapid, single-injection compound optimization on every compound using a unique MRM-based approach and then populates a database with this information; LightSight®, a software for metabolite identification; LipidView™, a software that streamlines the molecular characterization and quantification of lipid species from electrospray MS data; MarkerView™, a software for metabolomics and biomarker profiling across multiple samples; MetabolitePilot™, a software for TripleTOF® systems which streamlines the detection and identification of metabolites; MultiQuant™, a software that processes MRM data for quantitative information with a comprehensive user interface for superior data visualization; SignalFinder™ Integration Algorithm, which allows more reliable integration and less user intervention and extends the dynamic range functionality; and PeakView® software, which offers a qualitative review of LC/MS and MS/MS data for the TripleTOF® Systems. In certain embodiments, information gathering involves a fluorescent readout; non-limiting examples of software that can be used include, e.g. Molecular Dynamics FluorImager™ 575, SI Fluorescent Scanners, and Molecular Dynamics FluorImager™ 595 Fluorescent Scanners (all available from Amersham Biosciences UK Limited, Little Chalfont, Buckinghamshire, England).

Other methods for determining compound information, i.e. determination modules 40, include but are not limited to: Matrix Assisted Laser Desorption Ionization-Time of Flight (MALDI-TOF) systems and SELDI-TOF-MS; automated ELISA systems (e.g., DSX® or DS2® (available from Dynex, Chantilly, VA), the Triturus® (available from Grifols USA, Los Angeles, Calif.), or the Mago® Plus (available from Diamedix Corporation, Miami, Fla.)); densitometers (e.g. X-Rite-508-Spectro Densitometer® (available from RP Imaging™, Tucson, Ariz.) or the HYRYS™ 2 HIT densitometer (available from Sebia Electrophoresis, Norcross, Ga.)); automated fluorescence in situ hybridization systems (see for example, U.S. Pat. No. 6,136,540); 2D gel imaging systems coupled with 2-D imaging software; microplate readers; fluorescence activated cell sorters (FACS) (e.g. Flow Cytometer FACSVantage SE (available from Becton Dickinson, Franklin Lakes, N.J.)); and radio isotope analyzers (e.g. scintillation counters).

The compound information determined in the determination module can be read by the storage device 30. As used herein the “storage device” 30 is intended to include any suitable computing or processing apparatus or other device configured or adapted for storing data or information. Examples of electronic apparatus suitable for use with the present invention include stand-alone computing apparatus; data telecommunications networks, including local area networks (LAN), wide area networks (WAN), Internet, Intranet, and Extranet; and local and distributed computer processing systems. Storage devices 30 also include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage media, and magnetic tape; optical storage media such as CD-ROM and DVD; electronic storage media such as RAM, ROM, EPROM, EEPROM and the like; general hard disks; and hybrids of these categories such as magnetic/optical storage media. The storage device 30 is adapted or configured for having recorded thereon compound information or concentration level information. Such information may be provided in digital form that can be transmitted and read electronically, e.g., via the Internet, on diskette, via USB (universal serial bus) or via any other suitable mode of communication.

As used herein, “stored” refers to a process for encoding information on the storage device 30. Those skilled in the art can readily adopt any of the presently known methods for recording information on known media to generate manufactures comprising the compound information or concentration level information.

A variety of software programs and formats can be used to store the compound information or concentration level information on the storage device. Any number of data processor structuring formats (e.g., text file or database) can be employed to obtain or create a medium having recorded thereon the compound information or concentration level information.
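As a minimal sketch of one such format, assuming a relational database is chosen, compound information could be recorded and read back as shown below. The table name, column names, sample identifier, and intensity value are hypothetical and serve only as illustration; SQLite is used here simply because it ships with Python. The m/z value of 540.35 is the value reported for 1-TbAd in the transposon mutant screening method of Example 1.

```python
import sqlite3


def store_compound_information(conn, sample_id, records):
    """Record (compound, m/z, intensity) rows for one sample."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS compound_info ("
        "sample_id TEXT, compound TEXT, mz REAL, intensity REAL)"
    )
    conn.executemany(
        "INSERT INTO compound_info VALUES (?, ?, ?, ?)",
        [(sample_id, name, mz, intensity) for name, mz, intensity in records],
    )
    conn.commit()


def load_compound_information(conn, sample_id):
    """Read back the stored rows for one sample, ordered by m/z."""
    rows = conn.execute(
        "SELECT compound, mz, intensity FROM compound_info "
        "WHERE sample_id = ? ORDER BY mz",
        (sample_id,),
    )
    return rows.fetchall()


# Hypothetical sample identifier and intensity, for illustration only.
conn = sqlite3.connect(":memory:")
store_compound_information(conn, "sample-001", [("Formula I", 540.35, 1.2e5)])
stored = load_compound_information(conn, "sample-001")
```

A text file would serve equally well; the database form merely makes it easier for a comparison module to query by sample.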

By providing compound information in computer-readable form, one can use the information in readable form in the comparison module 80 to compare a specific data profile with the reference data within the storage device 30. For example, search programs can be used to identify fragments or regions of the peaks that match a particular compound of Formula I-IV (reference data, e.g., compound information obtained from a control sample, such as mass spec data, etc. of synthesized compound, or mass spec data of a reference sample). The comparison made in computer-readable form provides a computer readable comparison result which can be processed by a variety of means, e.g. using the software described herein. Content 140 based on the comparison result can be retrieved from the comparison module 80 to indicate infection with Mycobacterium tuberculosis; e.g. if one or more compounds is present, then infection is indicated.
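The comparison described above can be sketched, under stated assumptions, as a match of observed peak m/z values against reference m/z values. The function names, the example peak list, and the choice of a 10 ppm tolerance are assumptions of this sketch (the 10 ppm figure is borrowed from the XCMS settings in Example 1, not prescribed for the comparison module). The reference value 540.35 m/z is the one reported for 1-TbAd in Example 1.

```python
def ppm_error(observed_mz, reference_mz):
    """Absolute mass error in parts per million."""
    return abs(observed_mz - reference_mz) / reference_mz * 1e6


def compare_to_reference(observed_peaks, reference, tolerance_ppm=10.0):
    """Return {compound: True/False} for presence of each reference compound."""
    result = {}
    for compound, reference_mz in reference.items():
        result[compound] = any(
            ppm_error(mz, reference_mz) <= tolerance_ppm for mz in observed_peaks
        )
    return result


# Reference data: only the 1-TbAd value is taken from the text.
REFERENCE = {"1-TbAd": 540.35}

# Hypothetical observed peak list for a test sample.
comparison_result = compare_to_reference([136.06, 540.352, 612.40], REFERENCE)

# Infection is indicated if one or more reference compounds is present.
infection_indicated = any(comparison_result.values())
```

In practice a match would usually also require agreement in retention time and MS/MS fragmentation, as is done for 1-TbAd in the enzymatic assays of Example 1.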

In one embodiment the reference data stored in the storage device 30 to be read by the comparison module 80 is compound information data obtained from a control biological sample of the same type as the biological sample to be tested. Alternatively, the reference data are data from a population of samples. In one embodiment the reference data are compound data from the assay used for detection, and the data are indicative of TB, i.e. the data show the presence of one or more compounds of Formula I-IV. Alternatively, the reference data can be representative of data found in non-infected individuals and thus are indicative that one is not infected with TB bacteria.

The “comparison module” 80 can use a variety of available software programs and formats for the comparison operative to compare compound information determined in the determination module 40 to reference data. In one embodiment, the comparison module 80 is configured to use pattern recognition techniques to compare compound information from one or more entries to one or more reference data patterns. The comparison module 80 may be configured using existing commercially-available or freely-available software for comparing patterns, and may be optimized for particular data comparisons that are conducted. The comparison module 80 provides computer readable information related to the compound information that can include, for example, detection of the presence or absence of particular mycolic acids, and information regarding compound concentration, e.g. determined by peak height or by intensity of a signal such as fluorescence.

In one embodiment, the comparison module 80 uses compound information alignment programs such as LipidView™, or PeakView® software, which offers a qualitative review of LC/MS and MS/MS data.

The comparison module 80, or any other module of the invention, may include an operating system (e.g., UNIX) on which runs a relational database management system, a World Wide Web application, and a World Wide Web server. The World Wide Web application includes the executable code necessary for generation of database language statements (e.g., Structured Query Language (SQL) statements). Generally, the executable will include embedded SQL statements. In addition, the World Wide Web application may include a configuration file which contains pointers and addresses to the various software entities that comprise the server as well as the various external and internal databases which must be accessed to service user requests. The configuration file also directs requests for server resources to the appropriate hardware, as may be necessary should the server be distributed over two or more separate computers. In one embodiment, the World Wide Web server supports a TCP/IP protocol. Local networks such as this are sometimes referred to as “Intranets.” An advantage of such Intranets is that they allow easy communication with public domain databases residing on the World Wide Web (e.g., the GenBank or Swiss-Prot World Wide Web site). Thus, in a particular preferred embodiment of the present invention, users can directly access data (via Hypertext links for example) residing on Internet databases using an HTML interface provided by Web browsers and Web servers.

In one embodiment, the comparison module 80 performs comparisons with mass-spectrometry spectra; for example, comparisons of peak information can be carried out using spectra processed in MATLAB with a script called “Qcealign” (see for example WO2007/022248, herein incorporated by reference) and “Qpeaks” (Spectrum Square Associates, Ithaca, N.Y.), or Ciphergen Peaks 2.1™ software. The processed spectra can then be aligned using alignment algorithms that align sample data to the control data using a minimum entropy algorithm applied to baseline corrected data (see for example WIPO Publication WO2007/022248, herein incorporated by reference). The comparison result can be further processed by calculating ratios. Concentration profiles can be discerned.

In one embodiment of the invention, pattern comparison software is used to determine whether patterns of expression or mutations are indicative of a disease.

The comparison module 80 provides a computer readable comparison result that can be processed in computer readable form by predefined criteria, or criteria defined by a user, to provide a content based in part on the comparison result that may be stored and output as requested by a user using a display module 110. The display module 110 enables display of a content 140 based in part on the comparison result for the user, wherein the content 140 is a signal indicative of TB infection. Such a signal can be, for example, a display on a computer monitor of content 140 indicative of the presence or absence of a compound of Formula I-IV, indicating the presence or absence of TB infection; a printed page of content 140 from a printer indicating the presence or absence of TB infection; or a light or sound indicative of the presence or absence of TB infection.

The content 140 based on the comparison result may include a data profile of one or more compounds. In one embodiment, the content 140 based on the comparison includes a molecular signature of a particular compound. In one embodiment, the content 140 based on the comparison result is merely a signal indicative of the presence or absence of infection with TB bacterium (TB infection).

In one embodiment of the invention, the content 140 based on the comparison result is displayed on a computer monitor. In one embodiment of the invention, the content 140 based on the comparison result is displayed through printable media. The display module 110 can be any suitable device configured to receive from a computer and display computer readable information to a user. Non-limiting examples include general-purpose computers such as those based on an Intel PENTIUM-type processor, Motorola PowerPC, Sun UltraSPARC, or Hewlett-Packard PA-RISC processors, any of a variety of processors available from Advanced Micro Devices (AMD) of Sunnyvale, Calif., or any other type of processor; visual display devices such as flat panel displays, cathode ray tubes and the like; as well as computer printers of various types.

In one embodiment, a World Wide Web browser is used for providing a user interface for display of the content 140 based on the comparison result. It should be understood that other modules of the invention can be adapted to have a web browser interface. Through the Web browser, a user may construct requests for retrieving data from the comparison module. Thus, the user will typically point and click to user interface elements such as buttons, pull down menus, scroll bars and the like conventionally employed in graphical user interfaces. The requests so formulated with the user's Web browser are transmitted to a Web application which formats them to produce a query that can be employed to extract the pertinent information related to the compound information, e.g., display of an indication of the presence or absence of one or more of the compounds of Formula I-IV. In one embodiment, the compound information of the reference sample data is also displayed.

In one embodiment, the display module 110 displays the comparison result data and whether the comparison result is indicative of a disease, e.g., the data indicates the presence of one or more compounds of Formula I-IV.

In one embodiment, the content 140 based on the comparison result that is displayed is a signal (e.g. positive or negative signal) indicative of the presence or absence of TB infection; thus only a positive or negative indication may be displayed.
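A minimal sketch of how a comparison result might be reduced to such a positive or negative signal is given below. The function name, the dictionary layout of the comparison result, and the optional minimum-count parameter are assumptions of this sketch rather than features of any particular display module.

```python
def content_signal(comparison_result, minimum_compounds=1):
    """Reduce a {compound: present?} comparison result to a displayed signal.

    minimum_compounds is the number of detected compounds required before
    a positive signal is shown (1 by default).
    """
    detected = sum(1 for present in comparison_result.values() if present)
    return "positive" if detected >= minimum_compounds else "negative"


# Hypothetical comparison result for one sample.
example_result = {"Formula I": True, "Formula II": False, "Formula III": False}
signal = content_signal(example_result)
```

Raising `minimum_compounds` to 2 or 3 corresponds to embodiments in which a positive signal requires that at least two or at least three of the compounds be present.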

Embodiments of the present invention therefore provide for systems 10 (and computer readable medium 200 for causing computer systems) to perform methods for determining whether an individual has Mycobacterium tuberculosis infection based on compound information for the compounds of Formula I-IV.

System 10, and computer readable medium 200, are merely illustrative embodiments of the invention for performing methods of determining whether an individual has a specific disease or disorder, or a pre-disposition for a specific disease or disorder, based on compound information or concentration level of the compound/s, and are not intended to limit the scope of the invention. Variations of system 10, and computer readable medium 200, are possible and are intended to fall within the scope of the invention.

The modules of the machine, or used in the computer readable medium, may assume numerous configurations. For example, function may be provided on a single machine or distributed over multiple machines.

Kits

Another aspect of the invention provides a kit for detecting M. tuberculosis infection in a biological sample. In one embodiment, the kit comprises: (i) one or more compounds of Formula I-IV (i.e. the antigen) attached to a solid support (e.g. a membrane, an ELISA plate or column beads); (ii) an agent that detects the formation of an antigen-antibody complex, e.g. an anti-human antibody, optionally detectably labeled; and (iii) optionally, one or more antibodies that bind to a compound of Formula I-IV as a positive control antibody. Optionally, the kit further comprises compounds/reagents for detection of a labeled antibody, e.g. for detection of a labeled anti-human antibody. In yet another embodiment, a kit may additionally comprise a reference sample. Such a reference sample may, for example, be a protein sample derived from a biological sample isolated from one or more tuberculosis subjects. Alternatively, a reference sample may comprise a biological sample isolated from one or more normal healthy individuals not infected with Mycobacterium tuberculosis. Such a reference sample is optionally included in a kit for a diagnostic or prognostic assay.

Definitions

All patents, patent applications, and publications identified herein are expressly incorporated herein by reference in their entirety, e.g. for the purpose of describing and disclosing the methodologies described in such publications.

For convenience, certain terms employed in the entire application (including the specification, examples, and appended claims) are collected here. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

It should be understood that this invention is not limited to the particular methodology, protocols, and reagents, etc., described herein and as such may vary. The terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention, which is defined solely by the claims.

Other than in the operating examples, or where otherwise indicated, all numbers expressing quantities of ingredients or reaction conditions used herein should be understood as modified in all instances by the term “about.” The term “about,” when used to describe the present invention in connection with percentages, means ±1%.

In one respect, the present invention relates to the herein described compositions, methods, and respective component(s) thereof, as essential to the invention, yet open to the inclusion of unspecified elements, essential or not (“comprising”). In some embodiments, other elements to be included in the description of the composition, method or respective component thereof are limited to those that do not materially affect the basic and novel characteristic(s) of the invention (“consisting essentially of”). This applies equally to steps within a described method as well as compositions and components therein. In other embodiments, the inventions, compositions, methods, and respective components thereof, described herein are intended to be exclusive of any element not deemed an essential element to the component, composition or method (“consisting of”).

A “normal” or “healthy individual”, or “control group”, refers to individuals who do not have infection with Mycobacterium tuberculosis and who are preferably of similar age and race.

As used herein the terms “individual”, “subject”, and “patient” are used interchangeably and are intended to include humans and mammals.

Embodiments of the invention are further described in the following numbered paragraphs.

Paragraph 1, A method of identifying Mycobacterium tuberculosis in a subject comprising: a) measuring the presence or absence of at least one compound selected from the group consisting of a compound of Formula I (1-tuberculosinyladenosine), Formula II (6-tuberculosinyladenosine) and Formula III (a generic mycoloyl tuberculosinyladenosine), in a biological sample that is derived from a subject suspected of having Mycobacterium tuberculosis infection; wherein the presence of the at least one compound of step a) is indicative that the subject has Mycobacterium tuberculosis infection.

Paragraph 2, The method of paragraph 1, wherein the presence of at least two of the compounds of step a) is indicative that the subject has Mycobacterium tuberculosis infection.

Paragraph 3, The method of paragraph 1, wherein the presence of at least three of the compounds of step a) is indicative that the subject has Mycobacterium tuberculosis infection.

Paragraph 4, The method of any of paragraphs 1-3, further comprising administering to the subject a treatment for Mycobacterium tuberculosis.

Paragraph 5, A method for treatment of Mycobacterium tuberculosis comprising: administering a pharmaceutically effective amount of a Mycobacterium tuberculosis therapeutic to a subject that has the presence of at least one compound selected from the group consisting of a compound of Formula I, Formula II and Formula III.

Paragraph 6, The method of paragraph 5, wherein the pharmaceutically effective amount of a Mycobacterium tuberculosis therapeutic is administered to a subject that has the presence of at least two compounds selected from the group consisting of a compound of Formula I, Formula II and Formula III.

Paragraph 7, The method of paragraph 5, wherein the pharmaceutically effective amount of a Mycobacterium tuberculosis therapeutic is administered to a subject that has the presence of a compound of Formula I, of Formula II and of Formula III.

Paragraph 8, A method for determining if a subject is responsive to a Mycobacterium tuberculosis treatment comprising: a) measuring the concentration of at least one compound selected from the group consisting of a compound of Formula I, Formula II and Formula III, in a first sample from a subject; b) administering to the subject a treatment for Mycobacterium tuberculosis; and c) measuring the concentration of the one or more compounds of step a) in a second sample from the subject, wherein a decrease in concentration of the compound as compared to the concentration in the first sample is indicative that the subject is responding to the treatment for Mycobacterium tuberculosis and reducing infection.

Paragraph 9, The method of any of paragraphs 1-8, wherein the compound is a variant of the compound of Formula III represented by Formula IV (i.e. mycoloyl tuberculosinyladenosine as provided having R groups of C85 methoxy mycolate and C78 alpha mycolate).

Paragraph 10, The method of any of paragraphs 1-9, wherein the subject suspected of having Mycobacterium tuberculosis infection has been diagnosed as having a bacterial infection.

Paragraph 11, The method of any of paragraphs 1-10, wherein the subject is human.

Paragraph 12, The method of any of paragraphs 1-11, wherein the biological sample derived from the subject is selected from the group consisting of: breath, sputum, blood, urine, gastric lavage and pleural fluid.

Paragraph 13, The method of any of paragraphs 1-12, wherein the presence of the compound is measured using an assay selected from the group consisting of: mass spectrometry (MS), nuclear magnetic resonance spectroscopy and an immunoassay (e.g. high performance liquid chromatography mass spectrometry (HPLC-MS) or collision-induced dissociation mass spectrometry (CID-MS); or an immunoassay to detect antibodies against 1-TbAd, e.g. using an ELISA with a coated plate).

Paragraph 14, The method of paragraph 13, wherein the assay is an immunoassay that detects the presence of the compound/s by monitoring the presence of host antibodies directed against the compound/s (e.g. ELISA).

Paragraph 15, The method of paragraph 13, wherein the assay is an immunoassay that uses a non-host antibody that specifically binds to a compound of Formula I-IV (e.g. a capture antibody).

Paragraph 16, A system for analyzing a biological sample comprising: a) a determination module configured to receive data from measuring a compound present in a biological sample of a subject suspected of having Mycobacterium tuberculosis infection, wherein the compound is selected from the group consisting of a compound of Formula I, Formula II and Formula III, and to optionally determine the concentration of the compound; b) a storage device configured to store information from the determination module; c) a comparison module adapted to compare the data stored on the storage device with reference data, and to provide a comparison result, wherein the comparison result identifies the presence or absence of at least one compound selected from the group consisting of a compound of Formula I, Formula II, and Formula III, and wherein the presence of the at least one compound is indicative that the subject has Mycobacterium tuberculosis infection; and d) a display module for displaying a content based in part on the comparison result for the user, wherein the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least one compound of step c), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of each of the compounds of Formula I, Formula II and Formula III.

Paragraph 17, The system of paragraph 16, wherein in step d) the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least two compounds of step c), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of at least two of the compounds of step c).

Paragraph 18, The system of paragraph 16, wherein in step d) the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least three single compounds of step c).

Paragraph 19, The system of any of paragraphs 16-18, wherein the content further comprises a signal indicating that the subject should be treated for Mycobacterium tuberculosis in the presence of at least one compound selected from the group consisting of Formula I, Formula II, and Formula III.

Paragraph 20, The system of any of paragraphs 16-19, wherein the compound of Formula III is represented by Formula IV.

Paragraph 21, The system of any of paragraphs 16-20, wherein the determination module is configured to receive data from a mass spectrometer.

Paragraph 22, The system of any of paragraphs 16-21, wherein the subject suspected of having Mycobacterium tuberculosis infection has been diagnosed as having a bacterial infection.

Paragraph 23, The system of any of paragraphs 16-22, wherein the subject is human.

Paragraph 24, The system of any of paragraphs 16-23, wherein the biological sample derived from the subject is selected from the group consisting of: breath, sputum, blood, urine, gastric lavage and pleural fluid.

Paragraph 25, The system of any of paragraphs 16-24, wherein the determination module receives data from a mass spectrometer, nuclear magnetic resonance spectroscopy, high performance liquid chromatography, or an immunoassay (e.g. data from an ELISA plate reader).

Paragraph 26, A computer readable medium having computer readable instructions recorded thereon to define software modules including a comparison module and a display module for implementing a method on a computer, said method comprising: a) comparing with the comparison module the data stored on a storage device with reference data to provide a comparison result, wherein the comparison result identifies the presence or absence of at least one compound selected from the group consisting of a compound of Formula I, Formula II, and Formula III, and wherein the presence of the at least one compound is indicative that the subject has Mycobacterium tuberculosis infection; and b) displaying a content based in part on the comparison result for the user, wherein the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least one compound of step a), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of each of the compounds of Formula I, Formula II and Formula III.

Paragraph 27, The computer readable medium of paragraph 26, wherein in step b) the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least two compounds of step a), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of at least two of the compounds of step a).

Paragraph 28, The computer readable medium of paragraph 26, wherein in step b) the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least three single compounds of step a).

Paragraph 29, The computer readable medium of any of paragraphs 26-28, wherein the compound of Formula III is represented by Formula IV.

Paragraph 30, The computer readable medium of any of paragraphs 26-29, wherein the content further comprises a signal indicating that the subject should be treated for Mycobacterium tuberculosis in the presence of at least one compound selected from the group consisting of Formula I, Formula II, Formula III and Formula IV.
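The concentration comparison recited in Paragraph 8 can be sketched as follows. This is an illustrative sketch only: the function name and the example values are hypothetical, concentrations may be in any unit as long as both samples use the same one, and no minimum size of decrease is assumed because none is specified.

```python
def responding_to_treatment(first_concentration, second_concentration):
    """Return True if the compound concentration decreased after treatment.

    first_concentration: concentration in the first (pre-treatment) sample.
    second_concentration: concentration in the second sample.
    """
    if first_concentration < 0 or second_concentration < 0:
        raise ValueError("concentrations must be non-negative")
    return second_concentration < first_concentration


# Hypothetical concentrations, arbitrary but identical units.
responding = responding_to_treatment(12.0, 3.5)
```

A practical implementation would likely also account for assay variability before calling a decrease meaningful; that refinement is left out of this sketch.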

EXAMPLE 1 Identification of Tuberculosinyl Nucleotide Products of the Virulence Associated Enzyme Rv3378c

Methods.

Bacterial culture. Mycobacteria were cultured in triplicate in Tween-free Middlebrook 7H9 broth supplemented with 10% Oleic acid Albumin Dextrose Catalase (Becton Dickinson) in 50 mL polystyrene tubes (Corning) shaking at 100 rpm at 37° C., and a fourth culture was treated with TWEEN to disperse bacteria just before taking the OD600 measurement. Cultures were harvested when the TWEEN culture replicate reached an OD of 0.6 (+/−0.1). Stationary phase cultures of mycobacteria were cultured similarly but harvested at an OD of 2. Acid stressed cultures were grown in pH 4.5 citrate buffer.

Bacteria were cultured and extracted with chloroform/methanol mixtures or ethyl acetate, respectively, as described (4, 37). Lipid extracts were analyzed using an Agilent 6520 Accurate-Mass Q-Tof and a 1200 series HPLC system with a Varian Monochrom diol column (4, 37) with data output from XCMS and the MultiplotPreprocess and Multiplot modules of GenePattern (Broad Institute) (38). Rv3378c and GroES/GroEL chaperones were coexpressed in BL21-CodonPlus™ (Stratagene) cells and purified on a Ni-NTA HisTrap™ FF column (GE Healthcare). Purified Rv3378c (10 mg/mL) was crystallized by vapor diffusion and 2.20-Å resolution data were collected on the Advanced Light Source (ALS). The structure of Rv3378c was solved by SAD phasing of a mercury derivative using Phenix AutoSol. Enzymatic assays were performed by incubating fifty-six micrograms of diterpene in the presence of thirty-three micrograms of adenosine (Sigma) and eighty mg of purified Rv3378c in 1 mL of pH 7.4 Tris-HCl buffer for 4 hr at 37° C. under magnetic agitation. M. tuberculosis transposon mutants from a random library (25) were grown in 96 well format and heat killed, followed by lipid extraction with 70:30 methanol:isopropanol. Lipids were analyzed by HPLC-MS to monitor 1-TbAd production. 1-TbAd null strains were confirmed by regrowing the bacteria and using a full lipidomic analysis method. TbAd was purified from mycobacterial cell-associated lipid extract using normal and reversed phase chromatography. Structures were solved using CID-MS and NMR spectroscopy using a Bruker Avance 800.

Mycobacterial lipid extraction. HPLC-MS grade solvents (Fisher) and clean borosilicate glassware (Fisher), amber vials (Supelco) and Teflon-lined caps (Fisher) were used. Bacterial cultures were centrifuged (4,000 rpm, 10 min) to clarify culture supernatants, which were passed twice through a 0.22 μm filter to remove intact membrane fragments (1). Cell pellets were washed twice in 10 mL Optima water, resuspended in 1 mL of CH3OH, transferred to a 50 mL amber glass bottle and contacted with 25 mL CHCl3/CH3OH (2:1, v:v) overnight to sterilize bacteria. CHCl3/CH3OH suspensions were transferred to 50 mL conical glass tubes and rotated at 20° C. for at least 1 hr. After centrifugation, lipid extracts were decanted, and bacterial pellets were subjected to 2 additional extractions using CHCl3:CH3OH (1:1, v:v) and CHCl3:CH3OH (1:2, v:v), with pooling of extracts and evaporation with a GeneVac EZ-2 (SP Scientific) using the low boiling point mixture setting.

Dried lipids were resuspended in CHCl3:CH3OH (1:1, v:v) and dried under nitrogen in preweighed vials and then reweighed in triplicate on a microbalance (Mettler Toledo, XP205). Then extracts were redissolved in CHCl3:CH3OH (1:1, v:v) at 1 mg/mL.
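The arithmetic behind redissolving at 1 mg/mL can be illustrated with a short sketch. It assumes that the lipid mass is taken as the mean of the triplicate weighings minus the vial's pre-weighed (tare) mass, which the text does not state explicitly; the function name and the example weights are hypothetical.

```python
def resuspension_volume_ml(tare_mg, gross_weighings_mg, target_mg_per_ml=1.0):
    """Solvent volume (mL) needed to redissolve dried lipid at the target
    concentration, from the vial tare mass and repeated gross weighings."""
    mean_gross_mg = sum(gross_weighings_mg) / len(gross_weighings_mg)
    lipid_mg = mean_gross_mg - tare_mg
    if lipid_mg <= 0:
        raise ValueError("no measurable lipid mass")
    return lipid_mg / target_mg_per_ml


# Hypothetical vial: 1000.0 mg tare, three gross weighings after drying.
volume_ml = resuspension_volume_ml(1000.0, [1004.9, 1005.0, 1005.1])
```

With these example numbers the mean lipid mass is about 5 mg, so about 5 mL of CHCl3:CH3OH (1:1, v:v) would give 1 mg/mL.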

HPLC-ESI-QTof based Lipidomics. Using an Agilent Technologies 6520 Accurate-Mass Q-Tof and a 1200 series HPLC system with a Varian Monochrom diol column (3 μm×150 mm×2 mm) and a Varian Monochrom diol guard column (3 μm×4.6 mm), normal phase lipidomics was carried out as described (2). Total lipid extracts were resuspended at 0.5 mg/mL in solvent A (hexanes:isopropanol, 70:30 [v:v], 0.02% [m/v] formic acid, 0.01% [m/v] ammonium hydroxide), then filtered or centrifuged at 1,500 rpm for 5 min to remove trace non-lipidic materials prior to transfer to a glass autosampler vial (Agilent). Ten μg of lipid was injected, and the column (20° C.) was eluted at 0.15 mL/min with a binary gradient from 0% to 100% solvent B (isopropanol:methanol, 70:30 [v/v], 0.02% [m/v] formic acid, 0.01% [m/v] ammonium hydroxide): 0-10 min, 0% B; 17-22 min, 50% B; 30-35 min, 100% B; 40-44 min, 0% B, followed by an additional 6 min 0% B postrun. Raw data files were converted to mzData using MassHunter and processed in R using the XCMS (version 1.24) (3) centWave peak finder (4). In XCMS (http://metlin.scripps.edu/xcms/index.php), peaks were deconvoluted and aligned across samples using an s/n threshold of 5, a maximum tolerated m/z deviation of 10 ppm, a frame width of mzdiff=0.001, a peak width of 20-120 s and a band width of 5.
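The gradient program above can be written as a small lookup, shown here as a hedged sketch. It assumes linear ramps between the stated holds (the text gives only the hold segments), and the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % solvent B) breakpoints taken from the gradient described above.
GRADIENT = [
    (0, 0), (10, 0),      # 0-10 min, 0% B
    (17, 50), (22, 50),   # 17-22 min, 50% B
    (30, 100), (35, 100), # 30-35 min, 100% B
    (40, 0), (44, 0),     # 40-44 min, 0% B
]


def percent_b(time_min):
    """Percent solvent B at a given time, assuming linear ramps between holds."""
    if time_min < GRADIENT[0][0] or time_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-44 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= time_min <= t1:
            return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)


midpoint_first_ramp = percent_b(13.5)  # halfway between the 0% and 50% holds
```

Under the linear-ramp assumption, the composition halfway through the first ramp (13.5 min) is 25% B.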

Comparative lipidomics. XCMS data matrices listing detected features, median m/z and median RT of triplicate lipidic extracts were imported into GenePattern (Broad Institute) using the MultiplotPreprocess and Multiplot modules (5).

Protein expression and purification. Rv3378c and GroES/GroEL chaperones were coexpressed in BL21-CodonPlus™ (Stratagene) cells. Cell cultures were grown at 37° C. until OD600 reached ˜0.6 and induced with 0.2 mM isopropyl β-D-thiogalactopyranoside (IPTG) and 0.2% (w/v) L-arabinose at 22° C. overnight. Cells were lysed by sonication and lysate was purified on a Ni-NTA HisTrap™ FF column (GE Healthcare). Partially purified Rv3378c was cleaved with thrombin at 4° C. overnight, loaded onto the Ni-NTA column, and flow-through fractions were concentrated and purified by gel filtration on a Superdex™ 75 (GE Healthcare).

Rv3378c enzymatic assays. Fifty-six micrograms of dried TbPP or GGPP were resuspended in 1 mL of pH 7.4 Tris-HCl buffer (1 mM MgCl2, 0.1% Triton X-100 (w/v)) by sonication. Thirty-three μg of adenosine (Sigma) prepared at 1 mg/mL in pH 7.4 Tris-HCl buffer (33 μL) and 51 μL of recombinant Rv3378c at 16 mg/mL were added to the lipid solution and incubated for 4 hr at 37° C. under magnetic agitation. Lipid products were extracted three times from the reaction mixture using chloroform (3×0.5 mL), pooled, dried and analyzed by HPLC-MS as described above. The detection of 1-TbAd was confirmed based on m/z mass accuracy, retention time and MS/MS experiments (30 eV).
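The reagent amounts in this assay follow from volume times concentration, as the short sketch below illustrates. The function and variable names are hypothetical; the volumes and concentrations are those stated above, and the total volume assumes the added volumes are simply additive.

```python
def mass_ug(volume_ul, concentration_mg_per_ml):
    """Mass in micrograms delivered by a volume in microliters.

    1 mg/mL equals 1 ug/uL, so the product is already in micrograms.
    """
    return volume_ul * concentration_mg_per_ml


adenosine_ug = mass_ug(33, 1)    # 33 uL of adenosine at 1 mg/mL
enzyme_ug = mass_ug(51, 16)      # 51 uL of Rv3378c at 16 mg/mL
total_volume_ul = 1000 + 33 + 51  # lipid solution plus the two additions
```

This gives 33 μg of adenosine, matching the stated amount, and 816 μg of recombinant Rv3378c in a total reaction volume of about 1,084 μL.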

Cloning Rv3378c gene from M. tuberculosis. The Rv3378c gene (GenBank™ accession number: CAA15763.1) was amplified by PCR from M. tuberculosis H37Rv genomic DNA using PfuTurbo DNA polymerase (Stratagene), introducing flanking NdeI and XhoI restriction sites. Amplified and digested PCR products were ligated into predigested pET-28b vector (Novagen), resulting in an N-terminal cleavable hexahistidine tag followed by the protein coding sequence. Clones were verified by DNA sequencing (Elim Biopharm).

Transposon mutant library screening. Transposon mutants from a random library (50) were grown in 96 well format in Middlebrook 7H9 media to confluence and heat killed, followed by extraction with 100 μL of 70:30 methanol:isopropanol and shaking for 5 minutes. A 100 μL aliquot was transferred to a Millipore 96 well filter plate and centrifuged at 4500 rpm for 10 minutes. The collected filtrate was used for rapid HPLC-MS analysis using isocratic elution with 70:30 methanol:isopropanol for three minutes. 1-TbAd production was monitored in MS positive mode spectra at 540.35 m/z and in MS/MS positive mode spectra by the detection of the adenine fragment at 136.06 m/z. Mutants negative for these ions were recorded as potential 1-TbAd null strains, which were confirmed using a full lipidomic analysis.
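The screening rule can be sketched as a filter over per-well peak lists. This is an illustration under stated assumptions: "negative for these ions" is read as lacking both the 540.35 m/z ion in MS and the 136.06 m/z fragment in MS/MS, the 0.01 m/z matching window is an arbitrary choice, and the function names, data layout, and well contents are hypothetical.

```python
PARENT_MZ = 540.35            # 1-TbAd ion monitored in MS positive mode
ADENINE_FRAGMENT_MZ = 136.06  # adenine fragment monitored in MS/MS


def has_ion(peaks, target_mz, tolerance=0.01):
    """True if any peak lies within the (assumed) m/z window of the target."""
    return any(abs(mz - target_mz) <= tolerance for mz in peaks)


def potential_null_wells(plate):
    """Return wells lacking both monitored ions.

    plate maps a well name to a (ms_peaks, msms_peaks) pair of m/z lists.
    """
    nulls = []
    for well, (ms_peaks, msms_peaks) in plate.items():
        parent_found = has_ion(ms_peaks, PARENT_MZ)
        fragment_found = has_ion(msms_peaks, ADENINE_FRAGMENT_MZ)
        if not parent_found and not fragment_found:
            nulls.append(well)
    return sorted(nulls)


# Hypothetical three-well example.
example_plate = {
    "A1": ([540.35, 612.40], [136.06]),
    "A2": ([412.20], []),
    "A3": ([540.35], []),
}
candidates = potential_null_wells(example_plate)
```

Wells flagged this way are only candidates; as described above, null strains are confirmed by a full lipidomic analysis.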

Rv3377c-Rv3378c knock-in M. smegmatis strain or complementation of M. tuberculosis. Wild-type M. smegmatis or TbAd deficient M. tuberculosis strains were transformed with a plasmid that episomally expresses Rv3377c-Rv3378c genes under the control of a tetracycline inducible promoter (pTETGW) (6).

Rv3378c and GroES/GroEL protein expression and purification.

The Rv3378c gene (GenBank™ accession number: CAA15763.1) was amplified by PCR from M. tuberculosis H37Rv genomic DNA using PfuTurbo DNA polymerase (Stratagene) and cloned into pET-28b vector (Novagen). Rv3378c mutants were generated using the QuikChange method (Stratagene). All clones were verified by DNA sequencing (Elim Biopharm).

Rv3378c and GroES/GroEL chaperones were coexpressed in BL21-CodonPlus™ (Stratagene) cells to improve the solubility of Rv3378c. Cell cultures were grown at 37° C. until OD600 reached ˜0.6 and induced with 0.2 mM isopropyl β-D-thiogalactopyranoside (IPTG) and 0.2% (w/v) L-arabinose at 22° C. overnight. Cells were harvested by centrifugation (4,500 rpm, 20 min), resuspended in 20 mM Hepes, pH 7.5, 500 mM NaCl, 0.5 mM TCEP, and 25 mM imidazole with EDTA-free protease inhibitor cocktail (Roche). Resuspended cells were lysed by sonication and centrifuged (16,000 rpm, 90 min). Cleared lysate was purified on a Ni-NTA HisTrap™ FF column (GE Healthcare) with gradient elution using buffer containing 300 mM imidazole. Partially purified Rv3378c fractions were cleaved with thrombin at 4° C. overnight, loaded onto the Ni-NTA column, and flow-through fractions were concentrated and purified by gel filtration on a Superdex™ 75 (GE Healthcare) column equilibrated in 20 mM Hepes, pH 7.5, 50 mM NaCl, 0.5 mM TCEP, 10% glycerol.

Crystallographic structure determination of Rv3378c. Purified Rv3378c (10 mg/mL) was crystallized by vapor diffusion from 100 mM citrate, pH 3.5, 10-15% (w/v) polyethylene glycol 3350. A cluster of crystals was separated by gentle mechanical prodding with a cat whisker. The resulting single crystals were transferred to mother liquor containing 25% ethylene glycol and directly plunged into liquid nitrogen prior to data collection. X-ray diffraction data were collected at 100 K on the Advanced Light Source (ALS) beamline 8.3.1 and processed using HKL2000 (7). The 2.20-Å resolution native data set and 2.30-Å resolution ethylmercury phosphate derivative data set were collected at wavelengths of 1.1111 and 1.0083 Å, respectively. Different crystal forms were observed by additional screening with the Silver Bullets HT kit (Hampton Research), and a 2.36-Å resolution data set was collected at 1.1111 Å at 100 K on ALS beamline 8.3.1. The structure of Rv3378c was solved by SAD phasing of a mercury derivative using Phenix AutoSol (8). Initial models built by Phenix AutoBuild (8) were improved using ARP/wARP (9), followed by manual building in Coot (10). The native structures were solved by molecular replacement using the mercury-derivatized structure as a search model in Phaser (11). Structures were refined using Phenix Refine (8), with exclusion of 10% of the reflections to calculate Rfree. Models were validated using MolProbity (12). Secondary structures were assigned using DSSP (Dictionary of Protein Secondary Structure) (13) and structural figures were generated using PyMOL (http://www.pymol.org/) (14).

Purification of 1-tuberculosinyladenosine (Substance A). Gram quantities of M. tuberculosis H37Rv and H37Ra were extracted three times with chloroform and methanol solution as described above. 500 mg of lipid extract was concentrated under nitrogen, and the lipid slurry was loaded on an open silica gel column (2 cm×1.6 cm) using chloroform. Fractions were eluted with the following sequence of solvents: chloroform, 95:5 chloroform/isopropanol, 95:5, 90:10 and 50:50 chloroform/methanol (v/v) with ion monitoring (m/z 540.5) to track substance A, which eluted in the 95:5 (v/v) chloroform/methanol and the 50:50 chloroform/methanol fractions. After drying, reversed phase HPLC (Waters Corporation) purification of pooled fractions enriched for the target ion was carried out using an octadecyl-modified silica (5 micron) semi-preparative column (Higgins Analytical HAISIL C18, 250×10 mm). Using isocratic elution with 450:50:1 methanol/water/trifluoroacetic acid (v/v/v) at a flow rate of 3.0 mL/min, substance A appeared at 8 min. After drying with nitrogen and a 5-fold excess of acetonitrile, HPLC chromatography was repeated, giving pure 1-TbAd as assessed by MS and NMR spectroscopy.

METHODS REFERENCES

References

1. Madigan C A, et al. (2012) Lipidomic discovery of deoxysiderophores reveals a revised mycobactin biosynthesis pathway in Mycobacterium tuberculosis. Proc Natl Acad Sci USA 109(4):1257-1262.

2. Layre E, et al. (2011) A comparative lipidomics platform for chemotaxonomic analysis of Mycobacterium tuberculosis. Chem Biol 18(12):1537-1549.

3. Smith C A, Want E J, O'Maille G, Abagyan R, & Siuzdak G (2006) XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal Chem 78(3):779-787.

4. Tautenhahn R, Bottcher C, & Neumann S (2008) Highly sensitive feature detection for high resolution LC/MS. BMC Bioinformatics 9:504.

5. Reich M, et al. (2006) GenePattern 2.0. Nat Genet 38(5):500-501.

6. Sassetti C M, Boyd D H, & Rubin E J (2001) Comprehensive identification of conditionally essential genes in mycobacteria. Proc Natl Acad Sci USA 98(22):12712-12717.

7. Otwinowski Z & Minor W (1997) Processing of X-ray diffraction data collected in oscillation mode. Methods in Enzymology, eds Charles W & Carter J (Academic Press), Vol 276, pp 307-326.

8. Adams P D, et al. (2010) PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr 66(Pt 2):213-221.

9. Langer G, Cohen S X, Lamzin V S, & Perrakis A (2008) Automated macromolecular model building for X-ray crystallography using ARP/wARP version 7. Nat Protoc 3(7):1171-1179.

10. Emsley P & Cowtan K (2004) Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 60(Pt 12 Pt 1):2126-2132.

11. McCoy A J, et al. (2007) Phaser crystallographic software. J Appl Crystallogr 40(Pt 4):658-674.

12. Chen V B, et al. (2010) MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr 66(Pt 1):12-21.

13. Kabsch W & Sander C (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22(12):2577-2637.

14. Anonymous. The PyMOL Molecular Graphics System, Version 1.5.0.5, Schrödinger, LLC.

15. Maugel N, Mann F M, Hillwig M L, Peters R J, & Snider B B (2010) Synthesis of (+/−)-nosyberkol (isotuberculosinol, revised structure of edaxadiene) and (+/−)-tuberculosinol. Org Lett 12(11):2626-2629.

16. Davisson V J, et al. (1986) J Org Chem 51:4768.

Experiments

To identify lipids with roles in tuberculosis disease, we systematically compared the lipid content of virulent Mycobacterium tuberculosis with the attenuated vaccine strain M. bovis BCG. Comparative lipidomics analysis identified more than 1,000 molecular differences, including a previously unknown, M. tuberculosis-specific lipid that is composed of a diterpene unit linked to adenosine. We established the complete structure of the natural product as 1-tuberculosinyladenosine (1-TbAd) using mass spectrometry and nuclear magnetic resonance (NMR) spectroscopy. A screen for 1-TbAd mutants, complementation studies and gene transfer identified Rv3378c as necessary for 1-TbAd biosynthesis. Whereas Rv3378c was previously thought to function as a phosphatase, these studies establish its role as a tuberculosinyl transferase and describe a new biosynthetic pathway for the sequential action of Rv3377c-Rv3378c. Supporting this model, recombinant Rv3378c protein produced 1-TbAd, and its crystal structure revealed a cis-prenyl transferase fold with hydrophobic residues for isoprenol binding and a second binding pocket suitable for the nucleoside substrate. The dual-substrate pocket distinguishes Rv3378c from classical cis-prenyl transferases, providing a new model for the prenylation of diverse metabolites. Terpene nucleosides are rare in nature and 1-TbAd is known only in M. tuberculosis. Thus, this intersection of nucleoside and terpene pathways likely arose late in the evolution of the M. tuberculosis complex. 1-TbAd serves as an abundant chemical marker of M. tuberculosis, and the extracellular export of this amphipathic molecule likely accounts for the known virulence-promoting effects of the cytosolic Rv3378c enzyme.

Introduction

Mycobacterium tuberculosis remains one of the world's most important pathogens, with a mortality rate exceeding 1.5 million deaths annually (1). M. tuberculosis succeeds as a pathogen due to productive infection of the endosomal network of phagocytes. Its residence within the phagosome protects it from immune responses during its decades-long infection cycle. However, intracellular survival depends on active inhibition of pH-dependent killing mechanisms, which occurs for M. tuberculosis, but not species with low disease-causing potential (2). Intracellular survival is also enhanced by an unusually hydrophobic and multi-layered protective cell envelope. Despite study of this pathogen for more than a century, the spectrum of natural lipids within M. tuberculosis membranes is not yet fully defined. For example, the products of many genes annotated as lipid synthases remain unknown (3), and mass spectrometry detects hundreds of ions that do not correspond to known lipids in the MycoMass and LipidDB databases (4, 5).

To broadly compare the lipid profiles of virulent and avirulent mycobacteria, we took advantage of a recently validated metabolomics platform (4). This high performance liquid chromatography-mass spectrometry (HPLC-MS) system uses methods of extraction, chromatography and databases that are specialized for mycobacteria. After extraction of total bacterial lipids into organic solvents, HPLC-MS enables massively parallel detection of thousands of ions corresponding to diverse lipids that range from apolar polyketides to polar phosphoglycolipids. Software-based (XCMS) ion finding algorithms report reproducibly detected ions as molecular features. Each feature is a 3-dimensional data point with linked mass, retention time and intensity values from one detected molecule or isotope. All features with equivalent mass and retention time from two bacterial lipid extracts are aligned, allowing pairwise comparisons of MS signal intensity to enumerate molecules that are overproduced in one strain with a false positive rate below 1 percent (4).
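The alignment and pairwise comparison described above can be illustrated with a simplified Python sketch. It is not the XCMS algorithm and not the platform of reference (4); the greedy matching, the function names, the tolerances and the intensity floor are all assumptions made here for illustration.

```python
def align_features(features_a, features_b, mz_tol=0.01, rt_tol=0.5):
    """Pair features of extract A with features of extract B.

    Each feature is a (mz, retention_time_min, intensity) tuple. A feature of
    A with no partner in B is kept with a B intensity of 0.0, so that
    strain-specific molecules are not lost. Tolerances are assumed values.
    """
    pairs = []
    used = set()
    for mz_a, rt_a, int_a in features_a:
        match = None
        for j, (mz_b, rt_b, _) in enumerate(features_b):
            if j in used:
                continue
            if abs(mz_a - mz_b) <= mz_tol and abs(rt_a - rt_b) <= rt_tol:
                match = j
                break
        if match is None:
            pairs.append(((mz_a, rt_a), int_a, 0.0))
        else:
            used.add(match)
            pairs.append(((mz_a, rt_a), int_a, features_b[match][2]))
    return pairs


def fold_changes(pairs, floor=1.0):
    """Intensity ratio A/B per aligned feature; floor avoids division by zero."""
    return [(key, max(a, floor) / max(b, floor)) for key, a, b in pairs]
```

A real comparison would also use replicate measurements and a corrected significance test; this sketch covers only the matching on mass and retention time and the intensity ratio.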

This comparative lipidomics system allowed an unbiased, organism-wide analysis of lipids from M. tuberculosis and the attenuated vaccine strain, Mycobacterium bovis Bacille Calmette Guerin (BCG). BCG was chosen because of its worldwide use as a vaccine and its genetic similarity to M. tuberculosis (6). We reasoned that any features that are specifically detected in M. tuberculosis might be clinically useful as markers to distinguish tuberculosis-causing bacteria from vaccines. Further, given the differing potential for productive infection by the two strains, any M. tuberculosis-specific compounds would be candidate virulence factors. Comparative genomics of M. tuberculosis and BCG successfully identified “regions of deletion” (RD) that encode genes that were subsequently proven to promote productive M. tuberculosis infection (7), including ESX-1 (8, 9). We reasoned that a metabolite-based screen might identify new virulence factors because not all functions of RD genes are known. Also, biologically important metabolites could emerge from complex biosynthetic pathways that cannot be predicted from single gene analysis.

Comparison of M. tuberculosis and BCG lipid profiles revealed more than 1,000 differences, among which we identified a previously unknown, M. tuberculosis-specific diterpene-linked adenosine and showed that it is produced by the enzyme Rv3378c. Previously, Rv3378c was thought to generate free tuberculosinols (10-12). This discovery revises the enzymatic function of Rv3378c, which acts as a virulence factor to inhibit phagolysosome fusion (13). Whereas current models of prenyl transferase function emphasize iterative lengthening of prenyl pyrophosphates using one binding pocket, the crystal structure of Rv3378c identifies two pockets in the catalytic site, establishing a mechanism for heterologous prenyl transfer to non-prenyl metabolites.

Results

Comparative Lipidomics of M. tuberculosis and BCG

Using HPLC-MS for comparative analysis of lipid extracts of M. tuberculosis H37Rv and BCG (Pasteur strain), we detected 7,852 molecular features (FIG. 1, and data not shown). By aligning datasets and seeking features that significantly differed in intensity (corrected p-value<0.05), we identified 1,845 features that were overexpressed in one bacterium or the other (FIG. 1A). Among these features, we focused on molecules selectively expressed in M. tuberculosis that showed the highest fold-change ratios and intensity. We identified four molecular features corresponding to a singly charged molecular ion at m/z 540.357 (C30H45N5O4) and its isotopes (FIG. 1A), but this chemical formula did not match entries in the MycoMass (4) or other public databases. We named the unknown molecule substance A.

Substance A is an Abundant Natural Product of M. tuberculosis

The molecular ion of substance A was one of the most intense ions in the M. tuberculosis lipidome (FIG. 1A), suggesting that it was produced in abundance. Identification of an apparently abundant molecule in a widely studied pathogen was unexpected, leading to questions about whether substance A was truly a natural product. However, this compound was absent in media, solvent blanks and BCG lipid extracts, but was reproducibly detected in three reference strains of M. tuberculosis (FIG. 1B). As observed with cell-associated compounds (FIG. 1A), culture filtrate (FIG. 1C) yielded bright ions, whose intensity was higher than that of the abundantly secreted siderophore, carboxymycobactin. Its release into the extracellular space likely results from trans-membrane transport, rather than budding of intact cell wall fragments, as the cell wall-embedded lipids trehalose monomycolate and mycobactin were not detected in filtered supernatants (FIG. 1C). We detected substance A in M. tuberculosis during exponential or stationary phase, in several types of media, and when subjected to acid stress (FIG. 8A and FIG. 8B). Thus, substance A is a natural product, which is constitutively produced in many conditions and accumulates within and outside M. tuberculosis.

M. tuberculosis often compartmentalizes lipid biosynthesis so that lipids are assembled after transport across the plasma membrane. Sulfoglycolipids and phthiocerol dimycocerosates become undetectable when MmpL transporters are interrupted, even when biosynthetic genes are intact (14-16). Because ESX-1 is a transport system lacking in BCG, lack of export of an ESX-1 dependent lipid synthase might account for the loss of substance A. However, ESX-1 deficient M. tuberculosis lacking either the espA gene (Rv3616c) or the entire RD1 locus (17), which are both necessary for ESX-1 function, produces substance A at normal levels (FIG. 8C). After ruling out a major known species-specific difference in transport, we devised a screen to detect biosynthesis genes responsible for substance A.

Substance A is a 1-Tuberculosinyladenosine

Collision-induced mass spectrometry (CID-MS) identified the structural components of substance A as adenine ([M+H]⁺, C₅H₆N₅, m/z 136.0618), adenosine ([M+H]⁺, C₁₀H₁₄N₅O₄, m/z 268.1040), and a polyunsaturated C20 hydrocarbon ([M+H]⁺, C₂₀H₃₃, m/z 273.2576) (FIG. 2 and FIG. 14). A common C20 diterpene is geranylgeraniol, and M. tuberculosis produces two C20 lipids containing bicyclic halimane skeletons, tuberculosinol and isotuberculosinol (18-20). Initially, CID-MS spectra could not distinguish among these three candidate diterpenes (FIG. 14, FIG. 15), but multi-stage CID-MS studies isolated the diterpene unit of substance A (m/z 273.3) and yielded collision patterns that matched tuberculosinol more closely than geranylgeraniol (FIG. 14).
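As a worked check of the fragment assignments above, the following Python sketch computes the monoisotopic m/z expected for each ion formula. The function names are illustrative assumptions; the monoisotopic atomic masses and the electron mass are standard values, and each formula is taken as written in the text, that is, already including the charge-carrying proton.

```python
import re

# Standard monoisotopic atomic masses (u) and the electron mass (u).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858


def cation_mz(formula):
    """m/z of a singly charged cation given its full elemental formula."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass - ELECTRON  # a +1 ion is missing one electron


def ppm_error(observed, calculated):
    """Relative deviation of an observed m/z from the calculated value, in ppm."""
    return (observed - calculated) / calculated * 1e6


# Formula and reported m/z for each fragment named in the text.
fragments = {
    "adenine": ("C5H6N5", 136.0618),
    "adenosine": ("C10H14N5O4", 268.1040),
    "C20 hydrocarbon": ("C20H33", 273.2576),
}
for name, (formula, observed) in fragments.items():
    calc = cation_mz(formula)
    print(f"{name}: calculated {calc:.4f}, ppm error {ppm_error(observed, calc):+.1f}")
```

Under these assumptions each calculated value agrees with the reported m/z to within about 0.0001.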

After purification of the natural product, we carried out NMR spectroscopy analyses using ¹H 1D, 2D COSY, HMQC and NOESY spectra (data not shown, Summary FIG. 15), which unequivocally established the structure of substance A as 1-tuberculosinyladenosine (1-TbAd) (FIG. 2). The NMR signals of the diterpene moiety matched those of tuberculosinol (10, 19-21) except for the expected difference in the side chain protons and carbons. The spectral data of the adenosine and adjacent atoms correspond closely to those of 1-prenyladenosine analogues (22-24). The allylic methylene group absorbs downfield as a doublet at δ 4.92 (J=6.6 Hz). NOESY cross peaks between the adenine H-2 at δ 8.53 and the alkene hydrogen, allylic methylene and methyl groups at δ 5.46, 4.92 and 1.89, respectively, confirm that the tuberculosinyl group is attached to the adenine at position 1. Thus, M. tuberculosis produces a previously unknown type of diterpene nucleoside.

Rv3378c Produces 1-TbAd

To identify the genes necessary for 1-TbAd production, an existing library of random transposon insertional mutants (25) was screened in high throughput (4,196 mutants) for 1-TbAd production using a simplified 3 minute HPLC-MS method (FIG. 3A). Thirty mutants showing low or absent signals were rescreened using the original, high resolution lipidomic separation method (FIG. 1). Reporting only mutants with complete loss of TbAd signal in both assays, we identified two 1-TbAd-null mutants carrying transposons in Rv1796 (mutant 1) and Rv2867c (mutant 2) (FIG. 3B). The concurrently performed biochemical studies described above identified the highly characteristic tuberculosinyl moiety as a component of 1-TbAd, and the Rv3377c-Rv3378c locus was known to encode enzymes needed for tuberculosinol production (10, 11, 18-21). Sequencing identified spontaneous mutations in Rv3378c in both mutants (10, 18-21). Mutant 1 encoded a predicted Asp→Gly substitution at residue 34, and mutant 2 encoded a Pro→Ser substitution at residue 231. We generated complementation constructs to separately test whether the point mutations in Rv3378c or the transposon insertions were responsible for 1-TbAd loss. Transfer of Rv1796 and Rv2867c failed to restore 1-TbAd production (FIG. 9), but transfer of Rv3377c-Rv3378c reconstituted 1-TbAd production in both mutants (FIG. 3C). Thus, Rv3377c-Rv3378c genes are necessary for 1-TbAd biosynthesis in M. tuberculosis.

The Biosynthetic Pathway of 1-TbAd

Further, the known role of Rv3377c-Rv3378c in tuberculosinol production potentially provided a mechanism to connect these genes with the production of a nucleoside-modified tuberculosinol. Rv3377c is a terpene cyclase, which acts on geranylgeranyl pyrophosphate (GGPP) to generate tuberculosinyl pyrophosphate (TbPP). Rv3378c was thought to be a phosphatase, which converts TbPP to free tuberculosinol (10, 21). Extending current models (FIG. 4A), 1-TbAd might result from downstream action of an unknown gene on free tuberculosinol to transfer it to adenosine. Polyprenol synthase genes and the Rv3377c-Rv3378c locus are coordinately regulated and encoded at adjacent sites on the chromosome (26). Therefore, we searched M. tuberculosis databases for genes located near this locus that might plausibly function as adenosine transferases. We failed to find candidates and noted that no transposon insertion that blocked 1-TbAd production mapped to genes adjacent to this locus.

Therefore, we considered a new biosynthetic model in which Rv3378c protein is not a simple phosphatase, as currently believed, but instead acts with combined phosphatase and tuberculosinyl transferase functions, using adenosine as the nucleophilic substrate (FIG. 4B). This model is mechanistically simple and might explain the lack of an apparent stand-alone transferase gene. Also, whereas current models predict that tuberculosinol is the end product of this pathway, we did not detect tuberculosinol in lipidomics experiments (FIG. 1A and data not shown). The revised model posits that 1-TbAd is the end product of the Rv3378c pathway, explaining why it accumulates to high levels as one of the brightest ions in the lipidome (FIG. 1A). After chemical synthesis of TbPP, we tested TbPP and GGPP as substrates for the recombinant Rv3378c protein (18). Rv3378c catalyzed the condensation of adenosine and TbPP to generate 1-TbAd, but produced little or no product from GGPP, and free tuberculosinol was not detected in these assays (FIG. 4C and FIG. 4D). Thus, Rv3378c is a tuberculosinyl transferase, supporting the revised biosynthetic pathway (FIG. 4B).

Rv3377c-Rv3378c is Sufficient for TbAd Biosynthesis in Cells

To test the sufficiency of this locus for 1-TbAd production in cells, we transferred the Rv3377c-Rv3378c locus to M. smegmatis. In all three clones tested, expression of Rv3377c-Rv3378c transferred production of a molecule with the mass, retention time and CID-MS spectrum of 1-TbAd (FIG. 5 and FIG. 10). Thus, no other M. tuberculosis-specific co-factor or transporter is needed for 1-TbAd production. Rv3377c-Rv3378c is sufficient to synthesize 1-TbAd from ubiquitous cellular precursors present in most bacteria, likely GGPP and adenosine.

Crystal Structure of Rv3378c

To understand if the active site of Rv3378c is compatible with the revised function as a tuberculosinyl transferase, we determined its crystal structure. Lacking proteins with high sequence similarity, single-wavelength anomalous dispersion phasing was used to calculate the initial electron density map. The model was refined against native data to 2.2 Å resolution (data not shown). As expected from gel filtration studies, Rv3378c formed a homodimer (FIG. 6A). Although structural similarity was not predicted by sequence comparisons, Rv3378c adopts the fold seen in (Z)-prenyl, or cis-prenyl transferases (27), including M. tuberculosis (Z)-farnesyl diphosphate synthase (Rv1086) and decaprenyl pyrophosphate synthase (Rv2361c), as well as E. coli undecaprenyl pyrophosphate synthase (UPP) (28, 29) (FIG. 6B). These enzymes condense an allyl pyrophosphate and the 5-carbon isopentenyl pyrophosphate building block to produce linear isoprenoids (28, 29).

Structural Insight into Prenyl Unit Binding

In considering competing models that Rv3378c might simply hydrolyze the TbPP pyrophosphate, or carry out the newly proposed role in adenosine transfer (FIG. 5A and FIG. 5B), we superimposed Rv3378c with the pseudo-substrate and product complexes of Rv2361c (29) to model an enzyme-substrate (ES) complex. In contrast to other (Z)-prenyl transferases, Rv3378c has a unique C-terminal helical segment (residues 251-end), which contributes to domain swapping. An extra N-terminal helical segment (residues 6-24) packs via hydrophobic interactions with adjacent helices (FIG. 6A and data not shown).

Rv3378c shares functional motifs with the (Z)-prenyl transferases, including residues for substrate binding and catalysis: Asp34, Arg37 and Arg38 (FIG. 6B). (Z)-prenyl transferases bind the allyl pyrophosphate substrate through a characteristic DGNG/RRW amino acid sequence motif starting two residues before the N-terminus of an alpha helix (α3). The aspartate chelates a magnesium ion, while the glycine, the helix terminus and the arginine(s) engage the pyrophosphate (FIG. 6B and FIG. 6C) (27, 28, 30). In Rv3378c, Asp34 sits in the expected position to carry out its essential catalytic function, providing a specific mechanism that likely explains why mutant 1, which contains an Asp34→Gly alteration, does not produce TbAd. As predicted by prior studies showing the role of aspartate in prenyl transfer (27, 28, 30), and the conserved location of Asp34 vis-à-vis the prenyl binding site (FIGS. 6A-B), mutation to asparagine or alanine abolished the prenyl transferase function of Rv3378c (FIG. 11). In Rv2361c, the isoprene binding site is a hydrophobic pocket located between the β-sheet and the α2 (residues 89-110) and α3 (residues 129-152) helices (29). Rv3378c contains all of these features (FIG. 6C), including the 34-DGTRRW-39 motif and a deep pocket adjacent to helices α4 (residues 51-68) and α5 (residues 96-103). Hydrophobic residues (L56, L63, L100 and L101) are located in the pocket created by helices α4 and α5, and other hydrophobic residues (F33, 178, F158) further contribute to the hydrophobic character of the pocket. This binding pocket is predicted to position the pyrophosphate group of TbPP, which can interact with Arg37 and Arg38 from the DGTRRW motif and Tyr51 from the N-terminus of helix α4 (FIG. 6D).

A Second Pocket at the Catalytic Site

The binding mode of the nucleophilic adenosine substrate is harder to model, because the binding site is likely to be completed by the closure of the P-loop over the active site when native substrate is present. The P-loop is disordered in the unliganded structure, but it becomes ordered in a non-physiological complex with mellitic acid (data not shown). This structure suggests a specific mechanism by which substrate binding provides polar interactions with the P-loop to exclude water from the active site. Other considerations provide pertinent clues about the adenosine binding mode. In contrast to Rv2361c, Rv1086 and UPP synthase, Rv3378c has a second, side pocket that can accommodate adenosine (FIGS. 6D and E). Superimposing N1 of the adenine on the IPP nucleophile in complex with Rv2361c (29) guides the positioning of the adenosine substrate in the active site of Rv3378c. In Rv2361c, the pyrophosphate of IPP interacts with Arg244 and Arg250 (29). Consistent with the fact that adenosine lacks the pyrophosphate, Rv3378c lacks this conserved pair of arginines, which are replaced with glycine and serine. These features distinguish Rv3378c from known (Z)-prenyl transferases and are consistent with adenosine binding and transfer.

Substance B contains a core TbAd structure

Returning to the whole-organism screen (FIG. 1A), an independent effort to characterize substance B uncovered unexpected structural similarities to TbAd (substance A). The B cluster of 66 features was deconvoluted to identify individual features with properties of members of an alkyl series. The 66 features represented a pattern of two overlapping but non-identical alkyl series (B1 and B2) and their isotopes (data not shown). In particular, comparison of m/z 1659.514 and m/z 1775.632, which were the dominant molecular ions in the two alkyl series, yields a mass difference of 116.117 amu (data not shown). This mass difference matches within 0.003 amu to C7H16O, the characteristic difference between alpha (fCH2],e-CH═CH-CH3) and methoxy ((CH21o9-OCH3) mycolic acid. Separately, the detected m/z at 1659.514 (calculated 1659.510) and 1775.632 (calculated 1775.630) correspond to the expected masses of TbAd substituted with C78 α- and C85 methoxy-mycolic acids, respectively (data not shown).
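The mass-difference argument above can be checked with a short calculation. The sketch below assumes that the difference formula reads C7H16O, which is the composition consistent with the quoted 0.003 amu agreement, and uses standard monoisotopic atomic masses; the helper name is illustrative.

```python
# Standard monoisotopic atomic masses (u).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.9949146}


def neutral_mass(counts):
    """Monoisotopic mass of a neutral composition given as {element: count}."""
    return sum(MONOISOTOPIC[element] * n for element, n in counts.items())


# Assumed reading of the difference formula: C7H16O.
delta_formula = neutral_mass({"C": 7, "H": 16, "O": 1})

# Dominant molecular ions of the two alkyl series, as reported in the text.
delta_observed = 1775.632 - 1659.514

print(f"C7H16O: {delta_formula:.4f} u")
print(f"observed difference: {delta_observed:.3f} u")
print(f"deviation: {abs(delta_observed - delta_formula):.4f} u")
```

Subtracting the two rounded m/z values gives a difference that deviates from the C7H16O mass by less than the 0.003 amu stated in the text.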

Confirming the hypothesis that these ions represented mycolyl TbAd (MTbAd), CID-MS yielded the diagnostic fragments observed in the TbAd MS/MS spectrum, including ions at m/z 136.06, 273.25, and 408.31 assigned to protonated adducts of adenine, tuberculosinol, and tuberculosinyl adenine motifs, respectively (data not shown). Furthermore, the fragments detected at 1387.263 (calculated 1387.267) and 1503.375 (calculated 1503.387) m/z matched those of a C78 alpha and a C85 methoxy mycolic acid-linked adenosine, respectively (data not shown). Last, all 66 features detected in the B cluster could be explained by an alkyl series and isotopes of TbAd carrying individual mycolic acids with the expected chain lengths (C76-C88) and R-group patterns normally produced by M. tuberculosis (FIG. 1A). In contrast to TbAd, ions corresponding to MTbAd were weak (FIG. 1A), and the natural MTbAd product could not be purified in adequate yield for NMR studies. Nevertheless, the CID-MS spectra, mass accuracy and one-to-one correspondence of 15 deduced structures to the known chain length and R-group variants of naturally occurring mycolyl variants provide strong evidence for a mycoloylated form of TbAd in M. tuberculosis.

Rv3378c connects biosynthetic pathways

Nearly all bacteria express the enzymes needed for biosynthesis of terpenes and nucleosides, which normally have quite distinct functions in cell biology. However, these data suggested a model in which expression of the Rv3377c-Rv3378c operon in M. tuberculosis provides an unexpected connection of these evolutionarily ubiquitous pathways to create a hybrid terpene-nucleoside, which has few precedents in nature. TbAd itself might show broader expression or strict restriction to M. tuberculosis. To study the distribution of TbAd among microbes, we monitored TbAd ions by HPLC-MS and failed to detect signal in total lipid extracts from representative fungal (C. albicans, A. fumigatus), Gram-positive (S. aureus) and Gram-negative (E. coli) species (data not shown). Among bacteria more closely related to M. tuberculosis, we could not detect TbAd ions in non-mycobacterial Actinomycetales, environmental mycobacteria (M. smegmatis, M. fallax) or M. bovis (data not shown). Ions matching TbAd were present in three divergent M. tuberculosis reference strains (FIG. 1B) and clinical isolates (data not shown) of M. tuberculosis. This pattern of TbAd production matches the distribution of the Rv3377c-Rv3378c operon, which is expressed only in M. tuberculosis strains. M. bovis BCG contains the operon but, as in M. bovis, Rv3377c is inactivated by a frameshift mutation (27). In considering how M. tuberculosis, among all tested organisms, acquired the TbAd biosynthesis pathway, we wondered whether Rv3377c-Rv3378c was sufficient to make TbAd in an unrelated mycobacterium, or whether other unknown, but specialized M. tuberculosis-encoded accessory molecules or transport systems might be involved. The expression of Rv3377c-Rv3378c in M. smegmatis was sufficient for production of TbAd among all three isolates tested (FIG. 5). Thus, transfer of the Rv3377c-Rv3378c genes is sufficient to reconstitute TbAd biosynthesis in intact M. smegmatis. In contrast, the mycoloylated forms of TbAd were not detected in the M. smegmatis knock-in strain, suggesting that the generation of these acylated TbAd forms requires an M. tuberculosis-specific mycolyl transferase.

Here we report the discovery of TbAd as an abundant extracellular product of M. tuberculosis. This novel compound was detected in three forms, N¹-TbAd (positively charged at neutral pH), N⁶-TbAd (neutral at neutral pH; Young et al., submitted) and mycoloyl-TbAd. The constitutive production of these compounds under several growth conditions confirms their classification as natural products. TbAd distinguishes virulent M. tuberculosis from all other species, including the BCG vaccine strain, environmental mycobacteria, and other non-Actinomycetales bacteria. Among microbes studied to date, the pattern of Rv3377c-Rv3378c expression is in agreement with the previously described horizontal acquisition of Rv3376-Rv3378c by the M. tuberculosis complex. Importantly, this distribution implies that TbAd is an abundant biomarker specific for M. tuberculosis.

Higher order terpene-nucleosides are rare in nature, and we have not identified a direct precedent for N¹-linked prenyl adenosines. A C35 terpene cyclase activity is found in non-pathogenic mycobacteria (28, 29); however, Rv3377c orthologs are only functional in M. tuberculosis strains. Plants and marine sponges produce terpene-pu.

In an attempt to determine whether bacteria or other cellular organisms besides M. tuberculosis produce 1-TbAd, we sought orthologs of Rv3377c and Rv3378c, the two enzymes required for TbAd synthesis in M. tuberculosis. We focused on organisms that commonly cause lung disease or are used for vaccination in ways that can cause false positive ELISA tests. The basic local alignment search tool (BLAST) and a low stringency match criterion (30 percent amino acid sequence identity) failed to identify orthologs of these two biosynthetic genes in any species. Considering even bacteria with high genetic relatedness to M. tuberculosis, orthologs of Rv3377c and Rv3378c could not be identified in most actinobacteria, including disease-causing members of the M. avium complex, M. kansasii and M. marinum. Within the M. tuberculosis complex (MTC), M. bovis and the vaccine strain M. bovis BCG (Pasteur strain), M. canettii and M. africanum encode identifiable orthologs of both genes. However, many coding mutations are found in this locus in MTC species other than M. tuberculosis. These include G31V in Rv3377c of M. africanum; M. canettii has four coding mutations (T253A, V340I, A357Q, V361I, R497E), and Rv3378c in M. canettii carries L230I. M. bovis and the Pasteur strain of BCG (Pasteur) encode a frameshift mutation at nucleotide 1223, and Pasteur encodes a second point mutation (A137V). Frameshift mutations typically inactivate the enzyme, and the frameshift likely represents a cause of the known lack of 1-TbAd detected in the Pasteur strain of BCG. Further, we found that all of the common BCG strains used worldwide for vaccination (Pasteur, Copenhagen, Japan, Mexican, Australian, Russia, Glaxo, Prague, Phipps, Connaught, Denmark, Tice) contained the inactivating mutation, so 1-TbAd is likely absent from all commonly used BCG vaccines in the world.

Direct biochemical analysis of key strains for 1-TbAd production focused on microbes whose infection might mimic tuberculosis disease. As expected from the genetic analysis, lipid extracts from non-actinomycetes (Escherichia coli, Staphylococcus aureus) and fungi (Aspergillus fumigatus, Candida albicans) did not produce 1-TbAd signals.

Turning to mycobacteria, we could not detect 1-TbAd by HPLC-MS among reference strains of environmental (M. fallax) or non-pathogenic laboratory strains of mycobacteria (M. smegmatis, M. phlei) that lack Rv3378c orthologs. In agreement with the genetic results, we did not detect 1-TbAd among disease-causing bacteria that are related to M. tuberculosis but lack identifiable orthologs of Rv3377c-Rv3378c (M. avium, M. marinum) or those with orthologous loci carrying a known frameshift mutation (M. bovis). Among all strains tested to date, only M. tuberculosis produces 1-TbAd at detectable concentrations. These data support the conclusion that 1-TbAd and its biosynthetic genes are lacking from most or all mycobacteria other than M. tuberculosis. Specifically, 1-TbAd and its genes were not detected in any known environmental bacterium. Thus, environmental bacteria are unlikely to cause false positive results in tests that target 1-TbAd or patient antibodies to 1-TbAd.

Discussion

Overall, structural, genetic and biochemical data strongly suggest a revised function of Rv3378c as a tuberculosinyl transferase that produces 1-TbAd, an abundant amphiphile that is exported outside M. tuberculosis. This result establishes the efficacy of unbiased lipidomic screens to identify previously unknown compounds. A C35 terpene cyclase activity is found in non-pathogenic mycobacteria (31, 32), but Rv3377c orthologs are only known within the M. tuberculosis complex. Higher order terpene-nucleosides are rare in nature, and we have not identified a precedent for 1-linked prenyl adenosines. Plants and marine sponges produce terpene-purine derivatives, such as cytokinins and agelasines, which regulate growth or show antimicrobial effects (33). However, these natural products contain adenine rather than adenosine, and the terpene moiety is carried at the N⁶ position of the adenine in the cytokinins and N⁷ or N⁹ in the agelasines. Further, among microbes studied to date, we have only detected 1-TbAd in members of the M. tuberculosis complex, suggesting that 1-TbAd production is limited to pathogenic mycobacteria. Orthologs of Rv3377c or Rv3378c are limited to the M. tuberculosis complex. Although M. bovis and BCG strains encode orthologous genes, strains examined to date encode a frameshift mutation in Rv3377c (11), and the Pasteur strain used here encodes a second, coding point mutation in Rv3378c. Thus, both genetic and biochemical evidence suggest that 1-TbAd is a specific marker of M. tuberculosis, supporting the development of 1-TbAd or 1-TbAd-specific immune responses as candidate targets for diagnostic tests for tuberculosis.

The lack of 1-TbAd in BCG might represent evidence that changes in Rv3377c-Rv3378c might contribute to the vaccine strain's attenuation. More direct evidence for a role of this locus in virulence comes from transposon studies showing that Rv3377c and Rv3378c play non-redundant roles in phagosome-lysosome fusion and survival in macrophages (13). This key finding initiated an intensive search for the actual functions of these virulence-associated genes. Rv3377c is a terpene cyclase (18-20). Rv3378c has few orthologs in nature, and its biochemical function was not apparent from predictive folding algorithms. Based on in vitro studies, Rv3378c is currently thought to function as a TbPP pyrophosphatase that yields free tuberculosinol (10). Synthetic tuberculosinols coupled to beads block phagosomal acidification (21). However, end products of biosynthetic pathways typically accumulate, and to our knowledge, the extent of accumulation of free tuberculosinol as a natural product in intact M. tuberculosis remains unknown. We did not detect free tuberculosinols in lipidomics analysis of M. tuberculosis or among in vitro products of Rv3378c. This result does not rule out biosynthesis of free tuberculosinol, but it is notable that 1-TbAd is not only detected, it substantially accumulates within and outside M. tuberculosis. Further, we prove that the action of Rv3378c is a combined phosphatase and isotuberculosinol transferase through in vitro study of purified proteins, gene transfer to M. tuberculosis and M. smegmatis, as well as a structural analysis of Rv3378c. Based on parallel lines of genetic, biochemical and structural evidence, we propose that Rv3378c should be known as ‘tuberculosinyl adenosine transferase’.

The structures of enzymes that transfer prenyl pyrophosphates to substrates other than linear isoprenoids have not been determined previously. Like other (Z)-prenyl transferases, Rv3378c contains a characteristic allyl pyrophosphate-binding site, a catalytic aspartate and a flexible P-loop in position to close over the active site. The canonical TbPP binding pocket structure is sufficiently conserved that it may be sensitive to available drugs or analogues that target other (Z)-prenyl transferases. However, the nucleophile binding site lacks the conserved features that mediate recognition of the pyrophosphate moiety of isoprene building blocks seen in previously characterized (Z)-prenyl transferases. Instead, the Rv3378c active site contains a second side pocket in which the adenosine can be positioned for nucleophilic attack on C1 of TbPP. We observed this reaction in vitro and found that Rv3378c does not act on GGPP and specifically produces the 1-linked form of 1-TbAd, defining two aspects of the substrate specificity. Whereas most prenyl transferases have one identifiable pocket, this new two-pocket model suggests a broader paradigm for the prenylation of metabolites catalyzed by members of the (Z)-prenyl transferase family. Whereas current models emphasize iterative elongation through the repeated use of one pocket, the dual substrate pocket of Rv3378c provides a general model for prenylation of non-prenyl substrates. Product specificity is determined by a conventional allyl pyrophosphate binding site and a second pocket tailored to bind and activate each target nucleophile.

The larger 1-TbAd biosynthetic pathway starts with two evolutionarily conserved systems, which produce geranylgeranyl pyrophosphate and adenosine. These pathways operate separately in most organisms, but M. tuberculosis joins these two pathways to generate a terpene nucleoside. The appearance of TbAd after transfer of Rv3377c and Rv3378c genes to M. smegmatis proves that additional M. tuberculosis-specific genes, such as transporters, are not required for 1-TbAd biosynthesis. More generally, these data represent an experimental demonstration that transfer of two genes is sufficient to reconstitute a complex metabolite, which likely requires more than twenty genes for its complete biosynthesis. Combining this observation with data suggesting that the ancestral Rv3377c and Rv3378c genes were acquired by horizontal gene transfer (26), a scenario emerges by which evolutionarily ancient terpene and nucleotide biosynthetic pathways were joined together by transfer of two genes late in the evolution of the M. tuberculosis complex (26).

1-TbAd likely represents the mechanism by which Rv3378c carries out its known effects in promoting M. tuberculosis infectivity. Within minutes of phagocytosis, M. tuberculosis inhibits host defenses, including phagosome acidification and phagolysosome fusion (34, 35). The Rv3377c-Rv3378c locus is required for optimal phagosome maturation arrest (13). The discovery of extracellular 1-TbAd provides specific insight into mechanisms by which an enzyme localized in the cytosol affects events outside the bacterium (13). To our knowledge, neither Rv3378c nor free tuberculosinol has been detected in culture filtrates (18). In contrast, 1-TbAd is an amphiphile that is released into the extracellular space using an export mechanism that is independent of ESX-1.

Future studies will be needed to understand the particular mechanisms by which 1-TbAd contributes to the effects of Rv3377c-Rv3378c on phagosome maturation. Adenosine is almost exclusively found inside cells, and terpene chains catalyze the transfer of pyrophosphate across the mycobacterial envelope for the biosynthesis of arabinogalactan (36). By analogy, prenylation might promote the transit of the nucleoside to the phagosomal space, where the adenosine could engage host receptors. Alternatively, tuberculosinol might be the active moiety (12, 21), whose solubility or transport is influenced by adenosine. The cellular mechanism leading to altered mycobacterial survival might include changed integrity of the phagosomal membrane, intraphagosomal proton capture, or escape of 1-TbAd across the phagosomal membrane and into the host, where it might signal global changes in macrophage function.

Supporting Information

Low mass ion series of substance A. Enlargement of the low-mass ion series detected in the MS2 (QTOF) spectrum of substance A from M. tuberculosis that is shown in FIG. 8. The ion at m/z 136 is removed to simplify the graphical display. Four overlapping low-mass ion series were observed having from 1 to 4 unsaturations, as expected for a C₂₀H₃₃ hydrocarbon cation undergoing a complex multi-step fragmentation. The ion series with 1, 2, 3 or 4 unsaturation(s) are connected by dashed lines.

Spectra were determined as follows: ¹H NMR spectra of M. tuberculosis 1-tuberculosinyl adenosine dissolved in CD₃OD; COSY NMR spectra of M. tuberculosis 1-tuberculosinyl adenosine dissolved in CD₃OD; HMQC NMR spectra of M. tuberculosis 1-TbAd showing ¹³C resonances of carbon atoms bonded to at least one hydrogen and the corresponding ¹H resonance(s); and NOESY NMR spectra of M. tuberculosis 1-TbAd (data not shown). Expanded views of the NOESY spectra showed correlation of two olefinic protons with nearby terpenoid and adenine protons or ribose resonances.

Synthesis of tuberculosinyl pyrophosphate (TbPP). To a solution of TTBAHPP (58 mg, 64.3 μmol, 2 eq.) in CH₃CN (1 mL) in an oven-dried Schlenk flask under nitrogen atmosphere was added a solution of tuberculosinyl chloride (15) (10 mg, 32.2 μmol) in dry CH₃CN (0.5 mL). The solution was stirred for 3 h, after which TLC analysis, using n-pentane as an eluent, indicated complete conversion of the starting material. The solvent was removed under reduced pressure, after which the residue was dissolved in dry methanol and passed through a pre-washed column of DOWEX® 50WX2, Na-form (50-100 mesh). This process was repeated twice, after which the methanol was evaporated. High Resolution Mass Spectrometry (APCI) analysis detected tuberculosinol PP [M-OPP]⁺ at 273.2581 m/z (C₂₀H₃₃, calculated m/z 273.2577). The compound was used without further purification. For a more detailed procedure see Davisson, V. J. et al. (16).
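The agreement between the observed and calculated m/z values reported for the [M-OPP]⁺ ion can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the monoisotopic masses of carbon and hydrogen and the electron mass are standard values supplied here rather than taken from the text.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass (u).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503}
ELECTRON = 0.00054858


def cation_mz(composition, charge=1):
    """m/z of a positive ion given its elemental composition as {element: count}."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in composition.items())
    return (neutral - charge * ELECTRON) / charge


def ppm_error(observed, calculated):
    """Mass error of an observed m/z relative to the calculated m/z, in parts per million."""
    return (observed - calculated) / calculated * 1e6


calc = cation_mz({"C": 20, "H": 33})   # C20H33+ ion, [M-OPP]+
err = ppm_error(273.2581, calc)        # observed value reported in the text
print(f"calculated m/z {calc:.4f}, error {err:.1f} ppm")
```

The calculated value rounds to 273.2577, matching the figure given above, and the observed ion lies within about 2 ppm of it.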

We determined the structure of M. tuberculosis Rv3378c and the initial electron density map of Rv3378c. A density-modified map (2.0 σ, 2.30-Å resolution) from single-wavelength anomalous dispersion (SAD) phasing of a mercury derivative (ethylmercury phosphate) dataset was superimposed on the model of Rv3378c (data not shown). From superimposition of Rv3378c and Rv2361c (rmsd=2.65 for 406 Cα atoms) we observed an ordered P-loop in a nonphysiological complex with mellitic acid. We generated ribbon diagrams of Rv3378c apo and the Rv3378c:mellitic acid complex (data not shown). The P-loop (residues 80-95) of apo and mellitic acid complex.

Database S1

XCMS software was used in the R environment to list detected features from the HPLC-MS dataset obtained for M. tuberculosis and M. bovis BCG lipid extracts. Among all detected features, this list shows those that pass the filters of a minimum fold change in intensity of 2 and a corrected t-test p-value < 0.05.
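The two filters named above can be sketched in a few lines. This is not the XCMS implementation: it is a minimal illustration that assumes per-feature mean intensities and raw t-test p-values have already been computed. The text does not state which multiple-testing correction was used, so the Benjamini-Hochberg procedure is used here as an assumption. The function names, field names and feature values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_features(features, min_fold=2.0, alpha=0.05):
    """Keep features with fold change >= min_fold and adjusted p-value < alpha.

    Fold change is taken as the ratio of the larger to the smaller mean intensity,
    so enrichment in either strain counts (an assumption of this sketch).
    """
    adjusted = benjamini_hochberg([f["p"] for f in features])
    selected = []
    for feature, p_adj in zip(features, adjusted):
        low, high = sorted((feature["mean_a"], feature["mean_b"]))
        fold = float("inf") if low == 0 else high / low
        if fold >= min_fold and p_adj < alpha:
            selected.append({**feature, "fold": fold, "p_adj": p_adj})
    return selected


# Hypothetical features (invented m/z, intensities and p-values, for illustration only).
example_features = [
    {"mz": 500.1, "mean_a": 1000.0, "mean_b": 100.0, "p": 0.001},
    {"mz": 600.2, "mean_a": 150.0, "mean_b": 100.0, "p": 0.001},
    {"mz": 700.3, "mean_a": 500.0, "mean_b": 100.0, "p": 0.04},
    {"mz": 800.4, "mean_a": 300.0, "mean_b": 100.0, "p": 0.5},
]
selected = select_features(example_features)
```

In this made-up example only the first feature passes both filters: the second fails the fold-change cutoff, and the third has a raw p-value below 0.05 but an adjusted value above it, which shows why the correction step matters when many features are tested at once.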

RESULTS AND DISCUSSION REFERENCES

-   1. Dye C, Glaziou P, Floyd K, & Raviglione M (2013) Prospects for tuberculosis elimination. Annu Rev Public Health 34:271-286.
-   2. Sturgill-Koszycki S, et al. (1994) Lack of acidification in Mycobacterium phagosomes produced by exclusion of the vesicular proton-ATPase. Science 263(5147):678-681.
-   3. Camus J C, Pryor M J, Medigue C, & Cole S T (2002) Re-annotation of the genome sequence of Mycobacterium tuberculosis H37Rv. Microbiology 148(Pt 10):2967-2973.
-   4. Layre E, et al. (2011) A comparative lipidomics platform for chemotaxonomic analysis of Mycobacterium tuberculosis. Chem Biol 18(12):1537-1549.
-   5. Sartain M J, Dick D L, Rithner C D, Crick D C, & Belisle J T (2011) Lipidomic analyses of Mycobacterium tuberculosis based on accurate mass measurements and the novel “Mtb LipidDB”. J Lipid Res 52(5):861-872.
-   6. Behr M A, et al. (1999) Comparative genomics of BCG vaccines by whole-genome DNA microarray. Science 284(5419):1520-1523.
-   7. Mahairas G G, Sabo P J, Hickey M J, Singh D C, & Stover C K (1996) Molecular analysis of genetic differences between Mycobacterium bovis BCG and virulent M. bovis. J Bacteriol 178(5):1274-1282.
-   8. Brodin P, Rosenkrands I, Andersen P, Cole S T, & Brosch R (2004) ESAT-6 proteins: protective antigens and virulence factors? Trends Microbiol 12(11):500-508.
-   9. Fortune S M, et al. (2005) Mutually dependent secretion of proteins required for mycobacterial virulence. Proc Natl Acad Sci USA 102(30):10676-10681.
-   10. Nakano C, et al. (2011) Characterization of the Rv3378c gene product, a new diterpene synthase for producing tuberculosinol and (13R, S)-isotuberculosinol (nosyberkol), from the Mycobacterium tuberculosis H37Rv genome. Biosci Biotechnol Biochem 75(1):75-81.
-   11. Mann F M, et al. (2009) Characterization and inhibition of a class II diterpene cyclase from Mycobacterium tuberculosis: implications for tuberculosis. J Biol Chem 284(35):23574-23579.
-   12. Mann F M, et al. (2009) Edaxadiene: a new bioactive diterpene from Mycobacterium tuberculosis. J Am Chem Soc 131(48):17526-17527.
-   13. Pethe K, et al. (2004) Isolation of Mycobacterium tuberculosis mutants defective in the arrest of phagosome maturation. Proc Natl Acad Sci USA 101(37):13642-13647.
-   14. Domenech P, et al. (2004) The role of MmpL8 in sulfatide biogenesis and virulence of Mycobacterium tuberculosis. J Biol Chem 279(20):21257-21265.
-   15. Jain M & Cox J S (2005) Interaction between polyketide synthase and transporter suggests coupled synthesis and export of virulence lipid in M. tuberculosis. PLoS Pathog 1(1):e2.
-   16. Converse S E, et al. (2003) MmpL8 is required for sulfolipid-1 biosynthesis and Mycobacterium tuberculosis virulence. Proc Natl Acad Sci USA 100(10):6121-6126.
-   17. Garces A, et al. (2010) EspA acts as a critical mediator of ESX1-dependent virulence in Mycobacterium tuberculosis by affecting bacterial cell wall integrity. PLoS Pathog 6(6):e1000957.
-   18. Prach L, Kirby J, Keasling J D, & Alber T (2010) Diterpene production in Mycobacterium tuberculosis. Febs J 277(17):3588-3595.
-   19. Nakano C, Okamura T, Sato T, Dairi T, & Hoshino T (2005) Mycobacterium tuberculosis H37Rv3377c encodes the diterpene cyclase for producing the halimane skeleton. Chem Commun (Camb) (8):1016-1018.
-   20. Nakano C & Hoshino T (2009) Characterization of the Rv3377c gene product, a type-B diterpene cyclase, from the Mycobacterium tuberculosis H37 genome. Chembiochem 10(12):2060-2071.
-   21. Hoshino T, Nakano C, Ootsuka T, Shinohara Y, & Hara T (2011) Substrate specificity of Rv3378c, an enzyme from Mycobacterium tuberculosis, and the inhibitory activity of the bicyclic diterpenoids against macrophage phagocytosis. Org Biomol Chem 9(7):2156-2165.
-   22. Ottria R, Casati S, Baldoli E, Maier J A, & Ciuffreda P (2010) N(6)-Alkyladenosines: Synthesis and evaluation of in vitro anticancer activity. Bioorg Med Chem 18(23):8396-8402.
-   23. Casati S, Manzocchi A, Ottria R, & Ciuffreda P (2010) 1H, 13C and 15N NMR assignments for N6-isopentenyladenosine/inosine analogues. Magn Reson Chem 48(9):745-748.
-   24. Casati S, Manzocchi A, Ottria R, & Ciuffreda P (2011) 1H, 13C and 15N NMR spectral assignments of adenosine derivatives with different amino substituents at C6-position. Magn Reson Chem 49(5):279-283.
-   25. Sassetti C M, Boyd D H, & Rubin E J (2001) Comprehensive identification of conditionally essential genes in mycobacteria. Proc Natl Acad Sci USA 98(22):12712-12717.
-   26. Mann F M, Xu M, Davenport E K, & Peters R J (2012) Functional characterization and evolution of the isotuberculosinol operon in Mycobacterium tuberculosis and related Mycobacteria. Front Microbiol 3:368.
-   27. Kurokawa H & Koyama T (2010) Comprehensive Natural Products II: Chemistry and Biology (Elsevier Ltd) 2010 Ed.
-   28. Chang S Y, Ko T P, Chen A P, Wang A H, & Liang P H (2004) Substrate binding mode and reaction mechanism of undecaprenyl pyrophosphate synthase deduced from crystallographic studies. Protein Sci 13(4):971-978.
-   29. Wang W, et al. (2008) The structural basis of chain length control in Rv1086. J Mol Biol 381(1):129-140.
-   30. Guo R T, et al. (2005) Crystal structures of undecaprenyl pyrophosphate synthase in complex with magnesium, isopentenyl pyrophosphate, and farnesyl thiopyrophosphate: roles of the metal ion and conserved residues in catalysis. J Biol Chem 280(21):20762-20774.
-   31. Sato T, Kigawa A, Takagi R, Adachi T, & Hoshino T (2008) Biosynthesis of a novel cyclic C35-terpene via the cyclisation of a Z-type C35-polyprenyl diphosphate obtained from a nonpathogenic Mycobacterium species. Org Biomol Chem 6(20):3788-3794.
-   32. Sato T, Takagi R, Orito Y, Ono E, & Hoshino T (2010) Novel compounds of octahydroheptaprenyl mycolic acyl ester and monocyclic C35-terpene, heptaprenylcycline B, from non-pathogenic mycobacterium species. Biosci Biotechnol Biochem 74(1):147-151.
-   33. Vik A, et al. (2007) Antimicrobial and cytotoxic activity of agelasine and agelasimine analogs. Bioorg Med Chem 15(12):4016-4037.
-   34. Yates R M, Hermetter A, & Russell D G (2005) The kinetics of phagosome maturation as a function of phagosome/lysosome fusion and acquisition of hydrolytic activity. Traffic 6(5):413-420.

We claim: 1-4. (canceled)
 5. A method for treatment of Mycobacterium tuberculosis comprising: a. administering a pharmaceutically effective amount of a Mycobacterium tuberculosis therapeutic to a subject that has the presence of at least one compound selected from the group consisting of a compound of Formula I, Formula II and Formula III.
 6. The method of claim 5, wherein the pharmaceutically effective amount of a Mycobacterium tuberculosis therapeutic is administered to a subject that has presence of at least two compounds selected from the group consisting of a compound of Formula I, Formula II and Formula III.
 7. The method of claim 5, wherein the pharmaceutically effective amount of a Mycobacterium tuberculosis therapeutic is administered to a subject that has presence of a compound of Formula I, Formula II and of Formula III.
 8. A method for determining if a subject is responsive to a Mycobacterium tuberculosis treatment comprising: a) measuring the concentration of at least one compound selected from the group consisting of a compound of Formula I, Formula II and Formula III, in a first biological sample from a subject; b) administering to the subject a treatment for Mycobacterium tuberculosis; and c) measuring the concentration of the one or more compounds of step a) in a second biological sample from the subject, wherein a decrease in concentration of the compound as compared to the concentration in the first sample is indicative that the subject is responding to the treatment for Mycobacterium tuberculosis and reducing infection.
 9. The method of claim 5, wherein the compound is a variant of the compound of Formula III represented by Formula IV.
 10. The method of claim 5, wherein the subject has been diagnosed as having a bacterial infection.
 11. The method of claim 5, wherein the subject is human.
 12. The method of claim 8, wherein the biological sample derived from the subject is selected from the group consisting of: breath, sputum, blood, urine, gastric lavage and pleural fluid.
 13. The method of claim 5, wherein the presence of the compound is measured using an assay selected from the group consisting of: mass spectrometry (MS), nuclear magnetic resonance spectroscopy and an immunoassay.
 14. The method of claim 13, wherein the assay is an immunoassay that detects the presence of the compound/s by monitoring the presence of host antibodies directed against the compound/s.
 15. The method of claim 13, wherein the assay is an immunoassay that uses a non-host antibody that specifically binds to a compound of Formula I-IV.
 16. A system for analyzing a biological sample comprising: a. a determination module configured to receive data from measuring a compound present in a biological sample of a subject suspected of having Mycobacterium tuberculosis infection, wherein the compound is selected from the group consisting of a compound of Formula I, Formula II and Formula III, and to optionally determine the concentration of the compound; b. a storage device configured to store information from the determination module; c. a comparison module adapted to compare the data stored on the storage device with reference data, and to provide a comparison result, wherein the comparison result identifies the presence or absence of at least one compound selected from the group consisting of a compound of Formula I, Formula II, and Formula III; and wherein the presence of the at least one compound is indicative that the subject has Mycobacterium tuberculosis infection; and d. a display module for displaying a content based in part on the comparison result for the user, wherein the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least one compound of step c), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of each of the compounds of Formula I, Formula II and Formula III.
 17. The system of claim 16, wherein in step d) the content is a signal indicative that the subject has Mycobacterium tuberculosis infection in the presence of at least two compounds of step c), or a signal indicative that the subject lacks Mycobacterium tuberculosis infection in the absence of at least two of the compounds of step c).
 18. (canceled)
 19. The system of claim 16, wherein the content further comprises a signal indicating that the subject should be treated for Mycobacterium tuberculosis in the presence of at least one compound selected from the group consisting of Formula I, Formula II, and Formula III.
 20. The system of claim 16, wherein the compound of Formula III is represented by Formula IV.
 21. The system of claim 16, wherein the determination module is configured to receive data from a Mass Spectrometer.
 22. The system of claim 16, wherein the subject suspected of having Mycobacterium tuberculosis infection has been diagnosed as having a bacterial infection.
 23. The system of claim 16, wherein the subject is human.
 24. The system of claim 16, wherein the biological sample derived from the subject is selected from the group consisting of: breath, sputum, blood, urine, gastric lavage and pleural fluid.
 25. The system of claim 16, wherein the determination module receives data from a mass spectrometer, nuclear magnetic resonance spectroscopy, high performance liquid chromatography, or an immunoassay.
 26-30. (canceled)